{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ensembling\n",
    "\n",
    "*Adapted from Chapter 8 of [An Introduction to Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/)*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Why are we learning about ensembling?\n",
    "\n",
    "- Very popular method for improving the predictive performance of machine learning models\n",
    "- Provides a foundation for understanding more sophisticated models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lesson objectives\n",
    "\n",
    "Students will be able to:\n",
    "\n",
    "- Define ensembling and its requirements\n",
    "- Identify the two basic methods of ensembling\n",
    "- Decide whether manual ensembling is a useful approach for a given problem\n",
    "- Explain bagging and how it can be applied to decision trees\n",
    "- Explain how out-of-bag error and feature importances are calculated from bagged trees\n",
    "- Explain the difference between bagged trees and Random Forests\n",
    "- Build and tune a Random Forest model in scikit-learn\n",
    "- Decide whether a decision tree or a Random Forest is a better model for a given problem"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 1: Introduction\n",
    "\n",
    "Let's pretend that instead of building a single model to solve a binary classification problem, you created **five independent models**, and each model was correct about 70% of the time. If you combined these models into an \"ensemble\" and used their majority vote as a prediction, how often would the ensemble be correct?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0 1 1]\n",
      "[1 1 1 1 1 1 1 0 1 0 0 0 1 1 1 0 1 0 0 0]\n",
      "[1 1 1 1 0 1 1 0 0 1 1 1 1 1 1 1 1 0 1 1]\n",
      "[1 1 0 0 0 0 1 1 0 1 1 1 1 1 1 0 1 1 1 0]\n",
      "[0 0 1 0 0 0 1 0 1 0 0 0 1 1 1 1 1 1 1 1]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "# set a seed for reproducibility\n",
    "np.random.seed(1234)\n",
    "\n",
    "# generate 1000 random numbers (between 0 and 1) for each model, representing 1000 observations\n",
    "mod1 = np.random.rand(1000)\n",
    "mod2 = np.random.rand(1000)\n",
    "mod3 = np.random.rand(1000)\n",
    "mod4 = np.random.rand(1000)\n",
    "mod5 = np.random.rand(1000)\n",
    "\n",
    "# each model independently predicts 1 (the \"correct response\") if random number was at least 0.3\n",
    "preds1 = np.where(mod1 > 0.3, 1, 0)\n",
    "preds2 = np.where(mod2 > 0.3, 1, 0)\n",
    "preds3 = np.where(mod3 > 0.3, 1, 0)\n",
    "preds4 = np.where(mod4 > 0.3, 1, 0)\n",
    "preds5 = np.where(mod5 > 0.3, 1, 0)\n",
    "\n",
    "# print the first 20 predictions from each model\n",
    "print preds1[:20]\n",
    "print preds2[:20]\n",
    "print preds3[:20]\n",
    "print preds4[:20]\n",
    "print preds5[:20]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1 1 1 1 0 0 1 0 1 1 1 1 1 1 1 1 1 0 1 1]\n"
     ]
    }
   ],
   "source": [
    "# average the predictions and then round to 0 or 1\n",
    "ensemble_preds = np.round((preds1 + preds2 + preds3 + preds4 + preds5)/5.0).astype(int)\n",
    "\n",
    "# print the ensemble's first 20 predictions\n",
    "print ensemble_preds[:20]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.713\n",
      "0.665\n",
      "0.717\n",
      "0.712\n",
      "0.687\n"
     ]
    }
   ],
   "source": [
    "# how accurate was each individual model?\n",
    "print preds1.mean()\n",
    "print preds2.mean()\n",
    "print preds3.mean()\n",
    "print preds4.mean()\n",
    "print preds5.mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.841\n"
     ]
    }
   ],
   "source": [
    "# how accurate was the ensemble?\n",
    "print ensemble_preds.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note:** As you add more models to the voting process, the probability of error decreases, which is known as [Condorcet's Jury Theorem](http://en.wikipedia.org/wiki/Condorcet%27s_jury_theorem)."
   ]
  },
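  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough analytical check on the simulation above (a sketch that assumes the five models really are independent and that each is correct with probability exactly 0.7), the majority vote is correct whenever at least three of the five models are correct. The number of correct models $K$ then follows a binomial distribution:\n",
    "\n",
    "$$P(K \\ge 3) = \\sum_{k=3}^{5} \\binom{5}{k}\\, 0.7^{k}\\, 0.3^{5-k} = 0.3087 + 0.36015 + 0.16807 = 0.83692$$\n",
    "\n",
    "This is close to the simulated ensemble accuracy of 0.841; the small gap is consistent with sampling noise from using only 1000 observations."
   ]
  },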
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is ensembling?\n",
    "\n",
    "**Ensemble learning (or \"ensembling\")** is the process of combining several predictive models in order to produce a combined model that is more accurate than any individual model.\n",
    "\n",
    "- **Regression:** take the average of the predictions\n",
    "- **Classification:** take a vote and use the most common prediction, or take the average of the predicted probabilities\n",
    "\n",
    "For ensembling to work well, the models must have the following characteristics:\n",
    "\n",
    "- **Accurate:** they outperform the null model\n",
    "- **Independent:** their predictions are generated using different processes\n",
    "\n",
    "**The big idea:** If you have a collection of individually imperfect (and independent) models, the \"one-off\" mistakes made by each model are probably not going to be made by the rest of the models, and thus the mistakes will be discarded when averaging the models.\n",
    "\n",
    "There are two basic **methods for ensembling:**\n",
    "\n",
    "- Manually ensemble your individual models\n",
    "- Use a model that ensembles for you"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 2: Manual ensembling\n",
    "\n",
    "What makes a good manual ensemble?\n",
    "\n",
    "- Different types of **models**\n",
    "- Different combinations of **features**\n",
    "- Different **tuning parameters**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Machine learning flowchart](images/crowdflower_ensembling.jpg)\n",
    "\n",
    "*Machine learning flowchart created by the [winner](https://github.com/ChenglongChen/Kaggle_CrowdFlower) of Kaggle's [CrowdFlower competition](https://www.kaggle.com/c/crowdflower-search-relevance)*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparing manual ensembling with a single model approach\n",
    "\n",
    "**Advantages of manual ensembling:**\n",
    "\n",
    "- Increases predictive accuracy\n",
    "- Easy to get started\n",
    "\n",
    "**Disadvantages of manual ensembling:**\n",
    "\n",
    "- Decreases interpretability\n",
    "- Takes longer to train\n",
    "- Takes longer to predict\n",
    "- More complex to automate and maintain\n",
    "- Small gains in accuracy may not be worth the added complexity"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 3: Bagging\n",
    "\n",
    "The primary weakness of **decision trees** is that they don't tend to have the best predictive accuracy. This is partially due to **high variance**, meaning that different splits in the training data can lead to very different trees.\n",
    "\n",
    "**Bagging** is a general purpose procedure for reducing the variance of a machine learning method, but is particularly useful for decision trees. Bagging is short for **bootstrap aggregation**, meaning the aggregation of bootstrap samples.\n",
    "\n",
    "What is a **bootstrap sample**? A random sample with replacement:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]\n",
      "[ 6 12 13  9 10 12  6 16  1 17  2 13  8 14  7 19  6 19 12 11]\n"
     ]
    }
   ],
   "source": [
    "# set a seed for reproducibility\n",
    "np.random.seed(1)\n",
    "\n",
    "# create an array of 1 through 20\n",
    "nums = np.arange(1, 21)\n",
    "print nums\n",
    "\n",
    "# sample that array 20 times with replacement\n",
    "print np.random.choice(a=nums, size=20, replace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**How does bagging work (for decision trees)?**\n",
    "\n",
    "1. Grow B trees using B bootstrap samples from the training data.\n",
    "2. Train each tree on its bootstrap sample and make predictions.\n",
    "3. Combine the predictions:\n",
    "    - Average the predictions for **regression trees**\n",
    "    - Take a vote for **classification trees**\n",
    "\n",
    "Notes:\n",
    "\n",
    "- **Each bootstrap sample** should be the same size as the original training set.\n",
    "- **B** should be a large enough value that the error seems to have \"stabilized\".\n",
    "- The trees are **grown deep** so that they have low bias/high variance.\n",
    "\n",
    "Bagging increases predictive accuracy by **reducing the variance**, similar to how cross-validation reduces the variance associated with train/test split (for estimating out-of-sample error) by splitting many times an averaging the results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Manually implementing bagged decision trees (with B=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>price</th>\n",
       "      <th>year</th>\n",
       "      <th>miles</th>\n",
       "      <th>doors</th>\n",
       "      <th>vtype</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>22000</td>\n",
       "      <td>2012</td>\n",
       "      <td>13000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>14000</td>\n",
       "      <td>2010</td>\n",
       "      <td>30000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>13000</td>\n",
       "      <td>2010</td>\n",
       "      <td>73500</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>9500</td>\n",
       "      <td>2009</td>\n",
       "      <td>78000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>9000</td>\n",
       "      <td>2007</td>\n",
       "      <td>47000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>4000</td>\n",
       "      <td>2006</td>\n",
       "      <td>124000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>3000</td>\n",
       "      <td>2004</td>\n",
       "      <td>177000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>2000</td>\n",
       "      <td>2004</td>\n",
       "      <td>209000</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>3000</td>\n",
       "      <td>2003</td>\n",
       "      <td>138000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>1900</td>\n",
       "      <td>2003</td>\n",
       "      <td>160000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>2500</td>\n",
       "      <td>2003</td>\n",
       "      <td>190000</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>5000</td>\n",
       "      <td>2001</td>\n",
       "      <td>62000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>1800</td>\n",
       "      <td>1999</td>\n",
       "      <td>163000</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>1300</td>\n",
       "      <td>1997</td>\n",
       "      <td>138000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    price  year   miles  doors  vtype\n",
       "0   22000  2012   13000      2      0\n",
       "1   14000  2010   30000      2      0\n",
       "2   13000  2010   73500      4      0\n",
       "3    9500  2009   78000      4      0\n",
       "4    9000  2007   47000      4      0\n",
       "5    4000  2006  124000      2      0\n",
       "6    3000  2004  177000      4      0\n",
       "7    2000  2004  209000      4      1\n",
       "8    3000  2003  138000      2      0\n",
       "9    1900  2003  160000      4      0\n",
       "10   2500  2003  190000      2      1\n",
       "11   5000  2001   62000      4      0\n",
       "12   1800  1999  163000      2      1\n",
       "13   1300  1997  138000      4      0"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# read in and prepare the vehicle training data\n",
    "import pandas as pd\n",
    "url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/vehicles_train.csv'\n",
    "train = pd.read_csv(url)\n",
    "train['vtype'] = train.vtype.map({'car':0, 'truck':1})\n",
    "train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[array([13,  2, 12,  2,  6,  1,  3, 10, 11,  9,  6,  1,  0,  1]),\n",
       " array([ 9,  0,  0,  9,  3, 13,  4,  0,  0,  4,  1,  7,  3,  2]),\n",
       " array([ 4,  7,  2,  4,  8, 13,  0,  7,  9,  3, 12, 12,  4,  6]),\n",
       " array([ 1,  5,  6, 11,  2,  1, 12,  8,  3, 10,  5,  0, 11,  2]),\n",
       " array([10, 10,  6, 13,  2,  4, 11, 11, 13, 12,  4,  6, 13,  3]),\n",
       " array([10,  0,  6,  4,  7, 11,  6,  7,  1, 11, 10,  5,  7,  9]),\n",
       " array([ 2,  4,  8,  1, 12,  2,  1,  1,  3, 12,  5,  9,  0,  8]),\n",
       " array([11,  1,  6,  3,  3, 11,  5,  9,  7,  9,  2,  3, 11,  3]),\n",
       " array([ 3,  8,  6,  9,  7,  6,  3,  9,  6, 12,  6, 11,  6,  1]),\n",
       " array([13, 10,  3,  4,  3,  1, 13,  0,  5,  8, 13,  6, 11,  8])]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# set a seed for reproducibility\n",
    "np.random.seed(123)\n",
    "\n",
    "# create ten bootstrap samples (will be used to select rows from the DataFrame)\n",
    "samples = [np.random.choice(a=14, size=14, replace=True) for _ in range(1, 11)]\n",
    "samples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>price</th>\n",
       "      <th>year</th>\n",
       "      <th>miles</th>\n",
       "      <th>doors</th>\n",
       "      <th>vtype</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>1300</td>\n",
       "      <td>1997</td>\n",
       "      <td>138000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>13000</td>\n",
       "      <td>2010</td>\n",
       "      <td>73500</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>1800</td>\n",
       "      <td>1999</td>\n",
       "      <td>163000</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>13000</td>\n",
       "      <td>2010</td>\n",
       "      <td>73500</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>3000</td>\n",
       "      <td>2004</td>\n",
       "      <td>177000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>14000</td>\n",
       "      <td>2010</td>\n",
       "      <td>30000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>9500</td>\n",
       "      <td>2009</td>\n",
       "      <td>78000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>2500</td>\n",
       "      <td>2003</td>\n",
       "      <td>190000</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>5000</td>\n",
       "      <td>2001</td>\n",
       "      <td>62000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>1900</td>\n",
       "      <td>2003</td>\n",
       "      <td>160000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>3000</td>\n",
       "      <td>2004</td>\n",
       "      <td>177000</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>14000</td>\n",
       "      <td>2010</td>\n",
       "      <td>30000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>22000</td>\n",
       "      <td>2012</td>\n",
       "      <td>13000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>14000</td>\n",
       "      <td>2010</td>\n",
       "      <td>30000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    price  year   miles  doors  vtype\n",
       "13   1300  1997  138000      4      0\n",
       "2   13000  2010   73500      4      0\n",
       "12   1800  1999  163000      2      1\n",
       "2   13000  2010   73500      4      0\n",
       "6    3000  2004  177000      4      0\n",
       "1   14000  2010   30000      2      0\n",
       "3    9500  2009   78000      4      0\n",
       "10   2500  2003  190000      2      1\n",
       "11   5000  2001   62000      4      0\n",
       "9    1900  2003  160000      4      0\n",
       "6    3000  2004  177000      4      0\n",
       "1   14000  2010   30000      2      0\n",
       "0   22000  2012   13000      2      0\n",
       "1   14000  2010   30000      2      0"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the rows for the first decision tree\n",
    "train.iloc[samples[0], :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>price</th>\n",
       "      <th>year</th>\n",
       "      <th>miles</th>\n",
       "      <th>doors</th>\n",
       "      <th>vtype</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>3000</td>\n",
       "      <td>2003</td>\n",
       "      <td>130000</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>6000</td>\n",
       "      <td>2005</td>\n",
       "      <td>82500</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>12000</td>\n",
       "      <td>2010</td>\n",
       "      <td>60000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   price  year   miles  doors  vtype\n",
       "0   3000  2003  130000      4      1\n",
       "1   6000  2005   82500      4      0\n",
       "2  12000  2010   60000      2      0"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# read in and prepare the vehicle testing data\n",
    "url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/vehicles_test.csv'\n",
    "test = pd.read_csv(url)\n",
    "test['vtype'] = test.vtype.map({'car':0, 'truck':1})\n",
    "test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  1300.,   5000.,  14000.],\n",
       "       [  1300.,   1300.,  13000.],\n",
       "       [  2000.,   9000.,  13000.],\n",
       "       [  4000.,   5000.,  13000.],\n",
       "       [  1300.,   5000.,  13000.],\n",
       "       [  4000.,   5000.,  14000.],\n",
       "       [  4000.,   9000.,  14000.],\n",
       "       [  4000.,   5000.,  13000.],\n",
       "       [  3000.,   5000.,   9500.],\n",
       "       [  4000.,   5000.,   9000.]])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.tree import DecisionTreeRegressor\n",
    "\n",
    "# grow each tree deep\n",
    "treereg = DecisionTreeRegressor(max_depth=None, random_state=123)\n",
    "\n",
    "# list for storing predicted price from each tree\n",
    "predictions = []\n",
    "\n",
    "# define testing data\n",
    "X_test = test.iloc[:, 1:]\n",
    "y_test = test.iloc[:, 0]\n",
    "\n",
    "# grow one tree for each bootstrap sample and make predictions on testing data\n",
    "for sample in samples:\n",
    "    X_train = train.iloc[sample, 1:]\n",
    "    y_train = train.iloc[sample, 0]\n",
    "    treereg.fit(X_train, y_train)\n",
    "    y_pred = treereg.predict(X_test)\n",
    "    predictions.append(y_pred)\n",
    "\n",
    "# convert predictions from list to NumPy array\n",
    "predictions = np.array(predictions)\n",
    "predictions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  2890.,   5430.,  12550.])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# average predictions\n",
    "np.mean(predictions, axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "461.6997581401431"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# calculate RMSE\n",
    "from sklearn import metrics\n",
    "y_pred = np.mean(predictions, axis=0)\n",
    "np.sqrt(metrics.mean_squared_error(y_test, y_pred))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bagged decision trees in scikit-learn (with B=500)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# define the training and testing sets\n",
    "X_train = train.iloc[:, 1:]\n",
    "y_train = train.iloc[:, 0]\n",
    "X_test = test.iloc[:, 1:]\n",
    "y_test = test.iloc[:, 0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# instruct BaggingRegressor to use DecisionTreeRegressor as the \"base estimator\"\n",
    "from sklearn.ensemble import BaggingRegressor\n",
    "bagreg = BaggingRegressor(DecisionTreeRegressor(), n_estimators=500, bootstrap=True, oob_score=True, random_state=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  3351.2,   5384.4,  12971. ])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# fit and predict\n",
    "bagreg.fit(X_train, y_train)\n",
    "y_pred = bagreg.predict(X_test)\n",
    "y_pred"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "694.05710619996307"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# calculate RMSE\n",
    "np.sqrt(metrics.mean_squared_error(y_test, y_pred))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Estimating out-of-sample error\n",
    "\n",
    "For bagged models, out-of-sample error can be estimated without using **train/test split** or **cross-validation**!\n",
    "\n",
    "On average, each bagged tree uses about **two-thirds** of the observations. For each tree, the **remaining observations** are called \"out-of-bag\" observations."
   ]
  },
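  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two-thirds figure can be derived in one line (a sketch, assuming each bootstrap sample consists of $n$ draws with replacement from $n$ observations). Each draw misses a given observation with probability $1 - 1/n$, so the probability that the observation appears at least once is:\n",
    "\n",
    "$$P(\\text{observation } i \\text{ is in-bag}) = 1 - \\left(1 - \\frac{1}{n}\\right)^{n} \\approx 1 - e^{-1} \\approx 0.632 \\quad \\text{(for large } n\\text{)}$$\n",
    "\n",
    "For the 14-row vehicle training set used in this lesson, the exact value is $1 - (13/14)^{14} \\approx 0.646$, so each tree is expected to leave roughly 5 of the 14 observations out-of-bag."
   ]
  },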
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([13,  2, 12,  2,  6,  1,  3, 10, 11,  9,  6,  1,  0,  1])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the first bootstrap sample\n",
    "samples[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "set([0, 1, 2, 3, 6, 9, 10, 11, 12, 13])\n",
      "set([0, 1, 2, 3, 4, 7, 9, 13])\n",
      "set([0, 2, 3, 4, 6, 7, 8, 9, 12, 13])\n",
      "set([0, 1, 2, 3, 5, 6, 8, 10, 11, 12])\n",
      "set([2, 3, 4, 6, 10, 11, 12, 13])\n",
      "set([0, 1, 4, 5, 6, 7, 9, 10, 11])\n",
      "set([0, 1, 2, 3, 4, 5, 8, 9, 12])\n",
      "set([1, 2, 3, 5, 6, 7, 9, 11])\n",
      "set([1, 3, 6, 7, 8, 9, 11, 12])\n",
      "set([0, 1, 3, 4, 5, 6, 8, 10, 11, 13])\n"
     ]
    }
   ],
   "source": [
    "# show the \"in-bag\" observations for each sample\n",
    "for sample in samples:\n",
    "    print set(sample)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[4, 5, 7, 8]\n",
      "[5, 6, 8, 10, 11, 12]\n",
      "[1, 5, 10, 11]\n",
      "[4, 7, 9, 13]\n",
      "[0, 1, 5, 7, 8, 9]\n",
      "[2, 3, 8, 12, 13]\n",
      "[6, 7, 10, 11, 13]\n",
      "[0, 4, 8, 10, 12, 13]\n",
      "[0, 2, 4, 5, 10, 13]\n",
      "[2, 7, 9, 12]\n"
     ]
    }
   ],
   "source": [
    "# show the \"out-of-bag\" observations for each sample\n",
    "for sample in samples:\n",
    "    print sorted(set(range(14)) - set(sample))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How to calculate **\"out-of-bag error\":**\n",
    "\n",
    "1. For every observation in the training data, predict its response value using **only** the trees in which that observation was out-of-bag. Average those predictions (for regression) or take a vote (for classification).\n",
    "2. Compare all predictions to the actual response values in order to compute the out-of-bag error.\n",
    "\n",
    "When B is sufficiently large, the **out-of-bag error** is an accurate estimate of **out-of-sample error**."
   ]
  },
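  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For regression, the two steps above can be written out as follows. This is a sketch in notation introduced here for illustration: $\\hat{f}_b$ is the tree fit to bootstrap sample $b$, and $O_i$ is the set of trees for which observation $i$ was out-of-bag (assumed to be non-empty for every observation, which is very likely when B is large).\n",
    "\n",
    "$$\\hat{y}_i^{\\text{oob}} = \\frac{1}{\\lvert O_i \\rvert} \\sum_{b \\in O_i} \\hat{f}_b(x_i), \\qquad \\text{MSE}_{\\text{oob}} = \\frac{1}{n} \\sum_{i=1}^{n} \\left(y_i - \\hat{y}_i^{\\text{oob}}\\right)^2$$\n",
    "\n",
    "When the same out-of-bag predictions are summarized as an R-squared score instead, the two quantities are related by\n",
    "\n",
    "$$R^2_{\\text{oob}} = 1 - \\frac{\\sum_{i=1}^{n} \\left(y_i - \\hat{y}_i^{\\text{oob}}\\right)^2}{\\sum_{i=1}^{n} \\left(y_i - \\bar{y}\\right)^2}$$\n",
    "\n",
    "so an out-of-bag MSE can be recovered as $(1 - R^2_{\\text{oob}})$ times the variance of the response, $\\frac{1}{n}\\sum_{i}(y_i - \\bar{y})^2$."
   ]
  },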
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.7661434140978729"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# compute the out-of-bag R-squared score (not MSE, unfortunately!) for B=500\n",
    "bagreg.oob_score_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Estimating feature importance\n",
    "\n",
    "Bagging increases **predictive accuracy**, but decreases **model interpretability** because it's no longer possible to visualize the tree to understand the importance of each feature.\n",
    "\n",
    "However, we can still obtain an overall summary of **feature importance** from bagged models:\n",
    "\n",
    "- **Bagged regression trees:** calculate the total amount that **MSE** is decreased due to splits over a given feature, averaged over all trees\n",
    "- **Bagged classification trees:** calculate the total amount that **Gini index** is decreased due to splits over a given feature, averaged over all trees"
   ]
  },
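  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One common way to formalize these summaries is the mean decrease in impurity. The notation below is introduced here for illustration and is not part of the original lesson: $T_b$ is tree $b$, $v(t)$ is the feature used to split node $t$, $n_t$ is the number of training observations reaching node $t$, and $i(t)$ is the node impurity (MSE for regression, Gini index for classification).\n",
    "\n",
    "$$\\text{Imp}(j) = \\frac{1}{B} \\sum_{b=1}^{B} \\; \\sum_{t \\in T_b:\\, v(t) = j} \\frac{n_t}{n}\\, \\Delta i(t), \\qquad \\Delta i(t) = i(t) - \\frac{n_{t_L}}{n_t}\\, i(t_L) - \\frac{n_{t_R}}{n_t}\\, i(t_R)$$\n",
    "\n",
    "Here $t_L$ and $t_R$ are the two child nodes of $t$. Implementations typically rescale the importances so that they sum to 1, so only their relative sizes are meaningful."
   ]
  },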
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 4: Random Forests\n",
    "\n",
    "Random Forests is a **slight variation of bagged trees** that has even better performance:\n",
    "\n",
    "- Exactly like bagging, we create an ensemble of decision trees using bootstrapped samples of the training set.\n",
    "- However, when building each tree, each time a split is considered, a **random sample of m features** is chosen as split candidates from the **full set of p features**. The split is only allowed to use **one of those m features**.\n",
    "    - A new random sample of features is chosen for **every single tree at every single split**.\n",
    "    - For **classification**, m is typically chosen to be the square root of p.\n",
    "    - For **regression**, m is typically chosen to be somewhere between p/3 and p.\n",
    "\n",
    "What's the point?\n",
    "\n",
    "- Suppose there is **one very strong feature** in the data set. When using bagged trees, most of the trees will use that feature as the top split, resulting in an ensemble of similar trees that are **highly correlated**.\n",
    "- Averaging highly correlated quantities does not significantly reduce variance (which is the entire goal of bagging).\n",
    "- By randomly leaving out candidate features from each split, **Random Forests \"decorrelates\" the trees**, such that the averaging process can reduce the variance of the resulting model."
   ]
  },
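  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The variance argument above can be checked with a short standard-library sketch. It relies on a standard result: the average of B identically distributed quantities, each with variance sigma2 and pairwise correlation rho, has variance rho * sigma2 + (1 - rho) * sigma2 / B. The function name `variance_of_average` and the values of `B`, `p` and the seed are illustrative assumptions, not part of the original lesson.\n",
    "\n",
    "```python\n",
    "import math\n",
    "import random\n",
    "\n",
    "\n",
    "def variance_of_average(sigma2, rho, B):\n",
    "    # variance of the mean of B identically distributed quantities,\n",
    "    # each with variance sigma2 and pairwise correlation rho\n",
    "    return rho * sigma2 + (1 - rho) * sigma2 / B\n",
    "\n",
    "\n",
    "# with high correlation, averaging 500 trees barely reduces the variance\n",
    "for rho in (0.0, 0.5, 0.9):\n",
    "    print(rho, variance_of_average(sigma2=1.0, rho=rho, B=500))\n",
    "\n",
    "# candidate features for one split: m of the p features, chosen at random\n",
    "p = 16                  # hypothetical number of features\n",
    "m = int(math.sqrt(p))   # the square-root rule of thumb for classification\n",
    "candidates = random.Random(1).sample(range(p), m)\n",
    "print(sorted(candidates))\n",
    "```\n",
    "\n",
    "The first term, rho * sigma2, does not shrink as B grows, so lowering the correlation between trees is the only way to push the variance of the ensemble below that floor."
   ]
  },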
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 5: Building and tuning decision trees and Random Forests\n",
    "\n",
    "- Major League Baseball player data from 1986-87: [data](https://github.com/justmarkham/DAT8/blob/master/data/hitters.csv), [data dictionary](https://cran.r-project.org/web/packages/ISLR/ISLR.pdf) (page 7)\n",
    "- Each observation represents a player\n",
    "- **Goal:** Predict player salary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preparing the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# read in the data\n",
    "url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/hitters.csv'\n",
    "hitters = pd.read_csv(url)\n",
    "\n",
    "# remove rows with missing values\n",
    "hitters.dropna(inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AtBat</th>\n",
       "      <th>Hits</th>\n",
       "      <th>HmRun</th>\n",
       "      <th>Runs</th>\n",
       "      <th>RBI</th>\n",
       "      <th>Walks</th>\n",
       "      <th>Years</th>\n",
       "      <th>CAtBat</th>\n",
       "      <th>CHits</th>\n",
       "      <th>CHmRun</th>\n",
       "      <th>CRuns</th>\n",
       "      <th>CRBI</th>\n",
       "      <th>CWalks</th>\n",
       "      <th>League</th>\n",
       "      <th>Division</th>\n",
       "      <th>PutOuts</th>\n",
       "      <th>Assists</th>\n",
       "      <th>Errors</th>\n",
       "      <th>Salary</th>\n",
       "      <th>NewLeague</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>315</td>\n",
       "      <td>81</td>\n",
       "      <td>7</td>\n",
       "      <td>24</td>\n",
       "      <td>38</td>\n",
       "      <td>39</td>\n",
       "      <td>14</td>\n",
       "      <td>3449</td>\n",
       "      <td>835</td>\n",
       "      <td>69</td>\n",
       "      <td>321</td>\n",
       "      <td>414</td>\n",
       "      <td>375</td>\n",
       "      <td>N</td>\n",
       "      <td>W</td>\n",
       "      <td>632</td>\n",
       "      <td>43</td>\n",
       "      <td>10</td>\n",
       "      <td>475.0</td>\n",
       "      <td>N</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>479</td>\n",
       "      <td>130</td>\n",
       "      <td>18</td>\n",
       "      <td>66</td>\n",
       "      <td>72</td>\n",
       "      <td>76</td>\n",
       "      <td>3</td>\n",
       "      <td>1624</td>\n",
       "      <td>457</td>\n",
       "      <td>63</td>\n",
       "      <td>224</td>\n",
       "      <td>266</td>\n",
       "      <td>263</td>\n",
       "      <td>A</td>\n",
       "      <td>W</td>\n",
       "      <td>880</td>\n",
       "      <td>82</td>\n",
       "      <td>14</td>\n",
       "      <td>480.0</td>\n",
       "      <td>A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>496</td>\n",
       "      <td>141</td>\n",
       "      <td>20</td>\n",
       "      <td>65</td>\n",
       "      <td>78</td>\n",
       "      <td>37</td>\n",
       "      <td>11</td>\n",
       "      <td>5628</td>\n",
       "      <td>1575</td>\n",
       "      <td>225</td>\n",
       "      <td>828</td>\n",
       "      <td>838</td>\n",
       "      <td>354</td>\n",
       "      <td>N</td>\n",
       "      <td>E</td>\n",
       "      <td>200</td>\n",
       "      <td>11</td>\n",
       "      <td>3</td>\n",
       "      <td>500.0</td>\n",
       "      <td>N</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>321</td>\n",
       "      <td>87</td>\n",
       "      <td>10</td>\n",
       "      <td>39</td>\n",
       "      <td>42</td>\n",
       "      <td>30</td>\n",
       "      <td>2</td>\n",
       "      <td>396</td>\n",
       "      <td>101</td>\n",
       "      <td>12</td>\n",
       "      <td>48</td>\n",
       "      <td>46</td>\n",
       "      <td>33</td>\n",
       "      <td>N</td>\n",
       "      <td>E</td>\n",
       "      <td>805</td>\n",
       "      <td>40</td>\n",
       "      <td>4</td>\n",
       "      <td>91.5</td>\n",
       "      <td>N</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>594</td>\n",
       "      <td>169</td>\n",
       "      <td>4</td>\n",
       "      <td>74</td>\n",
       "      <td>51</td>\n",
       "      <td>35</td>\n",
       "      <td>11</td>\n",
       "      <td>4408</td>\n",
       "      <td>1133</td>\n",
       "      <td>19</td>\n",
       "      <td>501</td>\n",
       "      <td>336</td>\n",
       "      <td>194</td>\n",
       "      <td>A</td>\n",
       "      <td>W</td>\n",
       "      <td>282</td>\n",
       "      <td>421</td>\n",
       "      <td>25</td>\n",
       "      <td>750.0</td>\n",
       "      <td>A</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   AtBat  Hits  HmRun  Runs  RBI  Walks  Years  CAtBat  CHits  CHmRun  CRuns  \\\n",
       "1    315    81      7    24   38     39     14    3449    835      69    321   \n",
       "2    479   130     18    66   72     76      3    1624    457      63    224   \n",
       "3    496   141     20    65   78     37     11    5628   1575     225    828   \n",
       "4    321    87     10    39   42     30      2     396    101      12     48   \n",
       "5    594   169      4    74   51     35     11    4408   1133      19    501   \n",
       "\n",
       "   CRBI  CWalks League Division  PutOuts  Assists  Errors  Salary NewLeague  \n",
       "1   414     375      N        W      632       43      10   475.0         N  \n",
       "2   266     263      A        W      880       82      14   480.0         A  \n",
       "3   838     354      N        E      200       11       3   500.0         N  \n",
       "4    46      33      N        E      805       40       4    91.5         N  \n",
       "5   336     194      A        W      282      421      25   750.0         A  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hitters.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AtBat</th>\n",
       "      <th>Hits</th>\n",
       "      <th>HmRun</th>\n",
       "      <th>Runs</th>\n",
       "      <th>RBI</th>\n",
       "      <th>Walks</th>\n",
       "      <th>Years</th>\n",
       "      <th>CAtBat</th>\n",
       "      <th>CHits</th>\n",
       "      <th>CHmRun</th>\n",
       "      <th>CRuns</th>\n",
       "      <th>CRBI</th>\n",
       "      <th>CWalks</th>\n",
       "      <th>League</th>\n",
       "      <th>Division</th>\n",
       "      <th>PutOuts</th>\n",
       "      <th>Assists</th>\n",
       "      <th>Errors</th>\n",
       "      <th>Salary</th>\n",
       "      <th>NewLeague</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>315</td>\n",
       "      <td>81</td>\n",
       "      <td>7</td>\n",
       "      <td>24</td>\n",
       "      <td>38</td>\n",
       "      <td>39</td>\n",
       "      <td>14</td>\n",
       "      <td>3449</td>\n",
       "      <td>835</td>\n",
       "      <td>69</td>\n",
       "      <td>321</td>\n",
       "      <td>414</td>\n",
       "      <td>375</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>632</td>\n",
       "      <td>43</td>\n",
       "      <td>10</td>\n",
       "      <td>475.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>479</td>\n",
       "      <td>130</td>\n",
       "      <td>18</td>\n",
       "      <td>66</td>\n",
       "      <td>72</td>\n",
       "      <td>76</td>\n",
       "      <td>3</td>\n",
       "      <td>1624</td>\n",
       "      <td>457</td>\n",
       "      <td>63</td>\n",
       "      <td>224</td>\n",
       "      <td>266</td>\n",
       "      <td>263</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>880</td>\n",
       "      <td>82</td>\n",
       "      <td>14</td>\n",
       "      <td>480.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>496</td>\n",
       "      <td>141</td>\n",
       "      <td>20</td>\n",
       "      <td>65</td>\n",
       "      <td>78</td>\n",
       "      <td>37</td>\n",
       "      <td>11</td>\n",
       "      <td>5628</td>\n",
       "      <td>1575</td>\n",
       "      <td>225</td>\n",
       "      <td>828</td>\n",
       "      <td>838</td>\n",
       "      <td>354</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>200</td>\n",
       "      <td>11</td>\n",
       "      <td>3</td>\n",
       "      <td>500.0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>321</td>\n",
       "      <td>87</td>\n",
       "      <td>10</td>\n",
       "      <td>39</td>\n",
       "      <td>42</td>\n",
       "      <td>30</td>\n",
       "      <td>2</td>\n",
       "      <td>396</td>\n",
       "      <td>101</td>\n",
       "      <td>12</td>\n",
       "      <td>48</td>\n",
       "      <td>46</td>\n",
       "      <td>33</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>805</td>\n",
       "      <td>40</td>\n",
       "      <td>4</td>\n",
       "      <td>91.5</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>594</td>\n",
       "      <td>169</td>\n",
       "      <td>4</td>\n",
       "      <td>74</td>\n",
       "      <td>51</td>\n",
       "      <td>35</td>\n",
       "      <td>11</td>\n",
       "      <td>4408</td>\n",
       "      <td>1133</td>\n",
       "      <td>19</td>\n",
       "      <td>501</td>\n",
       "      <td>336</td>\n",
       "      <td>194</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>282</td>\n",
       "      <td>421</td>\n",
       "      <td>25</td>\n",
       "      <td>750.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   AtBat  Hits  HmRun  Runs  RBI  Walks  Years  CAtBat  CHits  CHmRun  CRuns  \\\n",
       "1    315    81      7    24   38     39     14    3449    835      69    321   \n",
       "2    479   130     18    66   72     76      3    1624    457      63    224   \n",
       "3    496   141     20    65   78     37     11    5628   1575     225    828   \n",
       "4    321    87     10    39   42     30      2     396    101      12     48   \n",
       "5    594   169      4    74   51     35     11    4408   1133      19    501   \n",
       "\n",
       "   CRBI  CWalks  League  Division  PutOuts  Assists  Errors  Salary  NewLeague  \n",
       "1   414     375       0         0      632       43      10   475.0          0  \n",
       "2   266     263       1         0      880       82      14   480.0          1  \n",
       "3   838     354       0         1      200       11       3   500.0          0  \n",
       "4    46      33       0         1      805       40       4    91.5          0  \n",
       "5   336     194       1         0      282      421      25   750.0          1  "
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# encode categorical variables as integers\n",
    "hitters['League'] = pd.factorize(hitters.League)[0]\n",
    "hitters['Division'] = pd.factorize(hitters.Division)[0]\n",
    "hitters['NewLeague'] = pd.factorize(hitters.NewLeague)[0]\n",
    "hitters.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# allow plots to appear in the notebook\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1805e828>"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAD3CAYAAADyvkg2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXl4jFcXwH83mUySyWYJkhAiJNbYYm8Ra+2UolrUVrV8\nqOqCLmiVLmiLVrW0qKJqq30pYmkRu6BU7LElEoIsk1nu98dMKswkEpM97+953sc7d85777mDe957\n77nnCCklCgoKCgoKAHa5rYCCgoKCQt5BMQoKCgoKCv+hGAUFBQUFhf9QjIKCgoKCwn8oRkFBQUFB\n4T8Uo6CgoKCg8B/ZZhSEEL5CiF1CiNNCiFNCiFHm8klCiEghxDHz1S7VM+OFEOeFEGeFEG2ySzcF\nBQUFBeuI7DqnIITwAryklMeFEK7AEaAr0BN4IKWc+YR8VWApUA8oDfwJBEopjdmioIKCgoKCBdk2\nU5BS3pJSHjffPwT+wTTYAwgrj3QBlkkpdVLKy0AEUD+79FNQUFBQsCRH9hSEEH5AbeCAuWikEOKE\nEGKBEKKIucwHiEz1WCSPjIiCgoKCQg6Q7UbBvHS0EhhtnjHMBcoDtYCbwIx0HldicCgoKCjkIKrs\nrFwI4QCsApZIKdcCSCmjUn0/H1hv/ngd8E31eBlz2ZN1KoZCQUEhw0gprS1XZ4hnGW9saS8vkG1G\nQQghgAXAGSnl16nKvaWUN80fXwTCzffrgKVCiJmYlo0CgDBrdRe0IH4hISGEhobmthpZjtKv/ENB\n7BOAaRiyjSmZkP3A5tZyn+ycKTwH9AFOCiGOmcsmAL2FELUwLQ1dAt4AkFKeEUKsAM4AemC4LGij\nfxr4+fnltgrZgtKv/ENB7FNW4ZDbCuQw2WYUpJT7sL5nsTmdZ6YCU7NLp7xKQf0PqfQr/1AQ+5RV\nZOsaex6ksPU3TxISEpLbKmQLSr/yDwWxT1mFc24rkMNk2+G17EIIUVhWlRQUFGxECGHzRvOPmZB/\nHWWjWUFBQaFAU9gGycLW3zyFlJLt27dz5coV6tSpQ3BwcG6rpKCg8ATKRrNCjiCl5I0B/fhr6xoa\nlJRMvAqTpk1nyNBhua2agoJCKgrbIKnsKeQSBw4c4NVOrQjvEo/GAS7EQa21aqJj43Bycspt9RQU\nCgRZsaewKhPy3cn/ewpKPoVc4tatW1Qpbo/GPDet4AHODnbcvXs3dxVTUFB4DIdMXAUBxSjkEsHB\nwRy8qWf3DTBK+O60oGgxT0qVKpXbqikoKKTCORNXQaCwLZflGXx9fVmyYjW9+7xMVGwc1QMrsG7L\nBuzsFDutoJCXKGyDpLKnkAdITk5GrVbnthoKCgWOrNhT2JcJ+efJ/3sKhc0I5kkUg6CgkHcpbIOk\nslahoKCgkA62bDSnk6v+SyHEP+ZkY6uFEB7mcj8hRGKqHPbfpaorWAgRbs5j/0129VcxCgoKCgrp\noMrEZQUdMEZKWQ1oCIwQQlQBtgHVpJQ1gX+B8ameiZBS1jZfw1OVzwUGSSkDgAAhRNss62QqFKOg\noKCgkA62zBTSyFXvI6XcLqU0msUOYkoqliZCCG/ATUqZkmNmMdDVln6lRWFbLlNQUFDIFFnlapoq\nV/3BJ74aCCxL9bm8OQdNHPCBOQ1BaR7PYX+dbMphrxgFBQUFhXRI71Da38D+DNRhJVd9Svn7QLKU\ncqm56AbgK6W8K4SoA6wVQlR7RtWfCcUoZAKj0YgQIktS/CkoKOQP0hskm5qvFL6yImMtV725vD/Q\nHmiZUialTAaSzfdHhRAXMKUmvs7jS0xWc9hnBcqeQgbQ6/UMHDgcR0cXHB1dePPN9zAajU9/UEFB\nId/joMr49STp5KpvC7wDdJFSJqUq9xRC2Jvv/TEZhIvmvPb3hRANzHX2BdaSDShGIQNMnjyN3347\nh15/A53uEj/+GMqcOXNzWy0FBYUcQK
XK+GWFlFz1zVO5mbYDZgOuwPYnXE+bASfMewq/A29IKe+Z\nvxsOzAfOY/JQ2pId/VVONGeAunVbcuTIe0Abc8ly2rRZxdatv+eoHgoKCpkjK040J7hkXF4Tn/9P\nNCszhQzg41MSO7vj/31WqY5TpkzJXNRIQUEhp7BxppDvUGYKGSAiIoJ69ZqSnNwMIZJxcTnCsWN/\n4+Pjk6N6KCgoZI6smCnITLz/iaj8P1NQjEIGuX37Nhs2bMDe3p7OnTtTrFixHNdBQUEhc2SJUcjE\nu5+4oRiFHKcgRklVUFDIHrLEKJTNhPzV/G8UCsgqmIKCgkI2UchGSWWjORuQUhIbG4tOp8ttVRQU\nFGzFPhNXAUAxCllMREQE5ctXx9u7PG5uxVi4cHFuq6SgoGALNoZJzW8oewpZTGBgbSIi+iPlaOAs\nzs4h7N+/lZo1a1qVP3XqFFevXqV69eqULZuJxUsFBYWnkiV7CrUyIX88/+8pKDOFLESr1XLhwimk\nHGUuqYyd3QscOXLEqvz4DydRP6QNr3zwDVVqBrN69Zo063748CHLly9n4cKFXL+eLSFPFBQUrFHI\nZgoFpBt5A7VajYtLUR48OIgpn0YiQhyhTJlXLWSPHz/OrHnzSfzwJIlunnDlKH0GtORexw4W6Tnv\n3r1LcHBDoqPVSOmEnd1Y9u3bRY0aNXKmYwoKhZlCNkoqM4UsRAjBr78uQKPphJtbd1xcatG+fT1a\nt25tIXvp0iVUfnXAzdNUUK4O2Ku5c+eOheyXX87g+vWiPHzYl/j4Hjx40Izhw8dkd3cUFBSg0G00\nFzIbmP106tSJ8PCDHD58GC+v0TRp0sRqqO3q1auT/O/fOP3YBbU2ikS1F05qFaVKlbKQjYy8QXKy\nd6qSMty48U829kJBQeE/CtkoWci6mzP4+/vj7++froyPjw+e7tC+3iZavGDH998YsA9qiL295etG\nmzYtWL16AvHx1QAnnJz20qpV82zSXkFB4TEK2ShZyLqbd9izZw++5fR8NsuU16lpSzsqlTxEbGys\nRQiNV199lbNn/+WLL77AaDTSpk1nvvlmRm6oraBQ+Chko6Syp5BLmFzlTAfdAPM9VpeahBBMmfIx\nSUkJJCUl8McfK3F2zqrMsQoKCunimInrCYQQvkKIXUKI00KIU0KIUebyYkKI7UKIf4UQ24QQRVI9\nM14IcV4IcVYI0SZVebAQItz83TfZ1V3FKOQSTZs25f69YowfLdiwxsCAl1R06NiOokWLpvmMnZ0d\nqoISn1dBIb9gm0uqDhgjpayGySVxhBCiCjAO2C6lDAR2mD8jhKgK9AKqAm2B78SjN8W5wCApZQAQ\nYM7eluVkm1HISgtZENFoNOwODUNlfIUVPzeicYM3WbxoRW6rpaCg8CQ2eB9JKW9JKY+b7x8C/wCl\ngc7AIrPYIqCr+b4LsExKqZNSXgYigAZCCG/ATUoZZpZbnOqZLCU7XztTLORxIYQrcEQIsR0YgMlC\nfiGEeA+ThRz3hIUsDfwphAiUUua7ZMj79u3jr7/+wsvLi1deeQUHBwercp6ennw7Z34Oa1fwiYmJ\nYfny5SQlJdG5c2cCAgJyWyWF/EwWjZJCCD+gNnAQKCWlvG3+6jaQ4nboAxxI9VgkpvFQZ75P4bq5\nPMvJNqMgpbwF3DLfPxRCpLaQzcxii4BQTIbhPwsJXBZCRAD1efwHyvN8//0PjB07keTkdjg6rmHe\nvMXs2bNVWfbJIW7dukW9mjUpc/8+zgYDUyZOZMuOHTRo0CC3VVPIr2TBf13zi/EqYLSU8kHqvUMp\npRRC5JnYPTkyUtloIfMNUkrGjBlLUtIKwA+93kh4eB82btxIly5dLORjYmL46JNJXLhykefqNWLc\nO++lOatQyBgzp08nMDaWl/R6AHx1Ot4dPZrdB/LVu4VCXiKdQ2mhNyH0VvqPCyEcMBmEX6SUa83F\nt4
UQXlLKW+aloShz+XXAN9XjZTCNhdfN96nLsyXeTbYbBRstpNXvJk2a9N99SEgIISEhWaKrrSQn\nJ5OcrOXR36kdUpYjNjbWQjYxMZHGzZugauKN56sB/DR/JeFnTrFiyfIc1bmgERMVRQmzQQDTG8dR\nK6fEFQomoaGhhIaGZm2l6YySIb6mK4XJJx7/3rxJvAA4I6X8OtVX64DXgM/Nf65NVb5UCDET00tx\nABBmHivvCyEaAGFAX2CWDb1Kk2w1CllgIa1awtRGIS/h6OhIcHBjjh2bgV4/BDiNlPto0mS6hezu\n3btJdJM0m9MLIQSl21dnTan3uHv3broeSArp06FrV/63ahUBCQlogM3OzrS3MktTKJg8+ZI4efJk\n2yt1sunp54A+wEkhxDFz2XjgM2CFEGIQcBnoCSClPCOEWAGcAfTA8FRhoYcDCwFnYJOUcotNmqVB\nthmFrLKQ2aVfdrFhwwp69uzPgQNtKF7ci0WLllOxYkULOSkldir7/84lCHs7EI/OLSg8G926dePa\n1at8OmkSyTodvV95hSmffZbbainkZ2yIaSSl3EfaXp6t0nhmKjDVSvkRIOjZtckY2ZZPQQjxPLAH\nOMmjZaDxmAb6FUBZzBZSSnnP/MwEYCAmCzlaSrnVSr15Op9CRnn48CFBwTVx61qJEs0DuPLD31TG\nmw2r1+W2agoKBYYsyacwNhPyM/J/PgUlyU4ucvPmTd79YBwXr16mcd0GfDLxY5ycbJurKigoPCJL\njMJ7mZD/XDEKOU5BMgoKCgrZS5YYhQmZkJ+a/42C4jyvoKCgkB6FbJQsZN1VUFBQyCSFbJQsZN1V\nUFBQyCRWop8WZBSjoKCgoJAehWyUVEJn5xO+nPklRUsUxcVNw4AhA9BqtbmtkoJC4cC20Nn5DsUo\n5DLR0dGcOHGCBw8epCmzcuVKvpo3g0F/d+GtS/04FPkX4z8cn4NaKigUYmwInZ0fUYxCLjLn2+8p\n4xdAoza98Clbgd27d1uV2/znZuqNqopnQFFcPJ1pNjmYLds357C2CgqFlEI2Uygg3ch/nD17ljHv\nTkD/8jHwKA9XttO+czfiYm5bhNkuWbwEZ08f++9z1JlYSnh65rTKCgqFk0I2Shay7uYdNm3ahL5Y\nTZNBACjXmoQkPZGRkfj5+T0m+9abY1nWaBkremzDqaiaf9ZcYtum7TmvtIJCYaSALAtlFMUo5BJ2\ndnZw5wQ8vA6upeH6PjAmkzq0eAolSpTg2KHjrFy5kqSkJDqM64C/v38uaK2gUAgpZJFnlDAXuURE\nRASVa9TBYLADB1fQPcBFoyLuThT29oXs1URBIZvIkjAXv2dCvkf+D3OhbDTnEt7e3hRxKwJ+/SB4\nHhRrQM0addI0CGvXrqVqcCMqVK/Dp599gdGY71JXKyjkT2zwPhJC/CSEuC2ECE9VtlwIccx8XUrJ\nsyCE8BNCJKb67rtUzwQLIcKFEOeFEN9kX2eV5aNcY8+ePSQ7+UGwOXmSVysOrytBbGwsxYoVe0x2\n165d9HztdXQDFoFLMSZ9+zpIyfvjMxG+UUFB4dmwbZT8GZgNLE4pkFK+nHIvhJgO3EslHyGlrG2l\nnrnAICllmBBikxCibXYl2VFmCrmEEAKkEVKWwqQRibS6p/DpFzPQtZsANdpDhYbo+3zPjDlzc1hj\nBYVCig0uqVLKvcBda9WaE5H1BJal17w5Q6WblDIl6dhioGvmO5IxlJlCLtG0aVM8He+RdHI0umIh\nOEcuoE37jlZTcd68eQ3sbj8qeHiHhMSEHNS24HL//n2OHz+Ou7s7NWvWtGqUFQo52bfF1wS4LaW8\nkKqsvHk5KQ74wJy5rTQQmUrmurksW1CMQi6h0WgI2x/K+Pcnc/7iIpr2a8gH71tfDmpYL5gzi2eB\n0QhuJWDDFMqVK5HDGhc8zp49S9M2rRClPdHeukPT+o1Ys3S5stGv
8DjZ533UG1ia6vMNwFdKeVcI\nUQdYK4Solm2tp0GBMwpSSjZt2kR4eDgBAQF069Ytz779eXp68uO82U+V8ypVGkrVhjgjxEZCwIsU\ndY986nMK6dN36Os4jetN8eHdMWqT2d96DIsXL2bAgAG5rZpCXiKdd4TQoxB6LO3v00IIoQJeBOqk\nlEkpk4Fk8/1RIcQFTLnqrwNlUj1exlyWLRQ4ozB27Hv88MMKtNraODr+xNq1m1i8eH6eNQwZ4cr1\n21C9D9QdZiq4eYTbewfmrlJ5mEuXLvHD99+RlJRAz5f70KhRI6tyF89H4NXRlIDXzlGNqnUw587/\nm5OqKuQH0hklQ+qbrhQm/5zhWlsB/0gpb6QUCCE8gbtSSoMQwh+TQbgopbwnhLgvhGiAKcd9X2BW\n5jqRcQrURvOtW7f47rvviY+fjF7fl/j4SaxevYEzZ87ktmo24V/WB04uBF2CaWP62Hx8vEvltlo2\nYzAY+GLmDNp068qg4cO4efOmzXVeuHCBRvVrYfh3JiVjvqNrx5Zs2WLdSSOoZg3iFm5CSokh7iHJ\na/ZRq0ZNm3VQKGDYsNEshFgG/A0ECiGuCSFSpqG9sNxgbgqcMO8p/A68IaVM8UwaDswHzmPyUMoW\nzyMoYDOFu3fv4uDggVbrZi5xwsHB5OaZnzl/IQIRewYxyxuhdsKoTeSKT0mrskajkSmff87yP9bi\n4ebOlxMn8vzzz1uVjYmJYcTIdzh6PJyqVQKZO2c63t7e2dmVx3j9fyP47eh+NG8P4OD+E2xo2IBz\nJ05SpEiRZ67zuzlfM6jZQz591XSOo1LpRKZ9PJ62bdtayP4ybz4h7V7g8s+bSb53n379+tGrV69n\nbluhgGLDKCml7J1GucUapZRyNbA6DfkjQNCza5JxCtRMoUKFCri5qRBiE/AQ2IWd3R1q1KiR47oY\nDIanyhw7dozqtRtRxNOHlm27cPv2batyJ0+fpPGEBgw+N4T+h1/lxVVdiXt4x6rshIkTmf7Haq5O\nHc+JPl1p+2JXTp48aVW/kJYdWL1fw3nX2Ww8UZbGz7ciKSnJar1bt24lsEZlSpbxos/AvsTHxz+1\nf+mh1+tZuGABxbf+iEuPthSd+R73y3uzfPlym+pNiH9ASY9HB/tKeUBiGp5avr6+nDt+kkPbdnLx\nn3N8/83sPLvMGBsbS8fuvSjqVZpKtery999/57ZKhQcldHb+Ra1Ws3v3NoKCwnF0HE5g4B527dqK\nh4dHjukQGRlJ7dqNcXBwpEiRkqxZs8aqXHR0NM1bt+d0sWHEtT/InjuVaflCZ6yF8Kjo749Q2eFW\nxoMiFYph76TCw9XNSq2wcOmvyHnTUT3fEHXv7hgG9Ob3lSst5CIiIrh05Ra6CrPBoyF6v6nExKs5\nfvy4hWx4eDg9+75MxS9CaLVvCIcTzjJg6KBM/jKPo9VqkUYjpPb0UautGrDM8FKvvnyxXsPWY3Do\nPIxepKHHy6+lKa9SqQgICMDLy8umdrObzj16sz2hGPcm7+ff1uN4oVNXrl69mttqFQ6U0Nn5m4CA\nAE6cOJhr7bdv350zZ6oj5TTi4s7Rp8/rHD5cmSpVqjwmt3//fmTxWlCpHwD6ep8RscSTqKgoSpV6\nfL/gw/EfENKmOZoSLjgVc2bvmK1MmzDVavsOajXy/sP/Ptvdf4iju+X+g1qtxmhIApkMwhGkAaPu\nIWq12kJ227ZtlO9dC9+2pj40+PZFVvpPy9wP8wSOjo7YOToS1WU0Hu/0Q3vwFMkHThDcq79N9bZs\n2ZJvvl3Ih1M/RKvV0vu1gYx9Z5xNdeY2SUlJHNgbiuHXjWCvghJl4dAqQkND6devX26rV/BRcjQr\nPCtarZbTp49iNE7HNAmrhhAN2L9/v4VRcHd3x/ggEox6sFNBYjRGXRKurq4W9datW5ct6zYzbeZn\nPNTeZ+bHM+nfz/rb78S33+HN
vsPQvj0Cce06Tuu20D/skIWcn58fzUOasOtwZxKL9MT5wQZqVitP\nzZqWG63u7u4kXH50Ev/B5Vhc3S31zAwqlYqRQ//Hd0uWET1qDkJvwKdEqSxZ03+pRw9e6tHD5nry\nCg4ODtjZ22OIuQ4ly5nOq0Rfwd3dPbdVKxwUslGykHX32blx4wbr1q3D3t6erl27UqKE5eExtVqN\nk5MLCQk7MIUzKQJcsHjzB2jSpAl1qvpyeFtbEjyb4HL1N0aOfRsXFxer7QcFBdG9YzeSkpIIados\nTT0HDxpEyRIlWP7HHxRxdeXdv/dTpkwZCzkhBGtXLeWrr2dx6MhealSrxzvvvGX14Fbv3r2ZMfsr\nQnv+gmsVTy7OP8xX02akqUNG+Wr657g4qdm0bTtlfHz4af4PVo1iZrl37x6rV68mKSmJ9u3bW+Sn\nyG/Y29sz5ZMpTP6kBQnP98X58mECPVS0b98+t1UrHBSyUVIJnZ0Bzp07R4MGTUlOroMQOjSasxw9\nuh9fX18L2SFDh/Hj4mVQvwdcPERR3V1uX43AwcHBQlan07Fw4UIuXbpCgwb16NKli9X279y5Q60G\njblbqjpGTVFUx9YTum0zwcHBWd7XtHjw4AELFiwgJjaG1q1a07RpU5vrXPTzz4wbO4JuVY0cu+WA\nZ0B91mzYZtOJ4jt37lC7cSMSq/tDUXeS14cSumUrderUefrDeZwtW7awZ+8+ypT2YeDAgTg5FbJA\n/89AloTOjsqEfMn8HzpbMQoZoFOnHmzc6ImUpqUNe/sF9Oun4aef5j0mJ6XExaMoiR/8DWWqgkGP\nyycNWTZjIp06dXrm9sdN+ICZR+6gG/i9qSD0ZxqeXcb+Xdueuc7cRkpJETcN+wcnUbUU6A3Q8CdX\nJs1aRseOHZ+53vfen8CPsZdwnzsRgIc/raLSb6H8tVXJVFcYyQqjYIzJuLxd8fxvFAqU91F2cfNm\nFFKW/++zwVCe69ctXx+Sk5NJSogHn0qmAnsVCUX8uHbtmk3t34iKRlc6lYuybxC3o6NtqhMgLi6O\nEcMH0vT5mgwe9ArRWVBnRtFqtSRqk6lsXoVT2UOVEpI7d6y72maUG9FRiKCA/z47VA8g2sY6FQo3\nBlXGr4KAYhQyQKdOrdFolmOKgBuNRvM7Xbq0sZBTqVRIBydY8QEkJ8LZvcjwbWn6/me4/Rda4bDu\nUxhZDYYHoprTj/atW9pUp9FopGOH5mjvLmXyyJO4ipW0btUYrVZrVT40NJTqDepSOsCfQSOGkZiY\naFVOSsnnn8/A168q5SoE8cMP863KOTk5Ub92EB/utCdRB3suwZZzksaNG9vUr46tWqOfvRTdhasY\n7sahnTyXdi1t+60UCjeFzSggpcxXl0nlnEWv18shQ/4n1WqNdHJyk++8M14ajUYLueTkZGln7yDx\nbSaxd5C4+0on38byl19+sVrvnj17ZGm/QKlSO8pa9Z+Xly5dsiq3detW6eBUSlJ+l6TicWnvUkt+\n8OFkm/p07tw56VtaIw2XkfIq0ngFGVTFTR44cMBC9vTp09LVs5j0Wz1NVj6zTJbsGiJ7D3jNar2z\nZ38nNcWDJLVCJTX/lJoi/vK331ZYlb1+/bps/lw96aCyl2W9PeXGjRtt6lMKUz7/TLoULSLVGo18\nddAAmZSUZHOdp0+flpVqBkuV2lGWrxIkjxw5kgWaKmQ35vHCpvEmTq/O8GVre3nhynUFnuUvKbcw\nGo1WjUFq2nZ8UeLbUtJ1p6TxZ1Lj7ilv375tIXf9+nXpWqSEpNd6yXsPpV2rz6R/pSBpMBgsZAcM\nHCbx/loSJE2X/9+yYqVgm/py4cIF6V3KWWojTEZBfwlZqaKrPHTokIXsjBkzpPf/eshacr+sJffL\n6lGbpMbD3Wq9wfWbS9ybSYRGIpwlHi1ku44909Xlab/ps5JV9SYmJsqSvn5SvDFP8utDyZu/yq
Kl\nfOS9e/eypH6F7CMrjEKsdM7wVRCMgrJ8lAmEEE8Ng3A+4oLpgNGhj+BGKEaJ1UBvYWFh2JWuB4Ed\nQe2CseG73Lh5k6goy70KdzcX7Iw3HhXob+CahutqRilfvjx16zWmx3Bnlq6FPm864eVTmdq1LTMB\nuri4YLzxaLdNd+MOTi4aq/VG3b4FSW5QLBaKRUGClhs30j95m12hJbKq3vPnz5No54RsMwScXKDJ\nKxiKleHUqVNZUr9C3saAfYavgkBBWQXLE2i1Wi6dPwNvHgfzgGS/4zWOHDlicSjM09MTQ0wE6LWg\ncoT7kRiTE60eSBoz5n8sXNyQB7cSMIriOMV/yxfzl1rIZQYhBL+v3MiXX0zjjz1hBFYPYv74j6y6\ng7788st89vVMbr42BbsqZYmf+wdfTJpstV5nJ1dwGm06JY0jOI3AVfOLTbpmJxcvXmTXrl24u7vT\nqVMnq26exYsXJ/nubXgQA27FIfEBuqireHp65oLGCjmNvoAM9hlFMQpZiFqtxsW9KA9uHgCfRqBL\nhFuHKVPmVQvZ5557jlbPB7Nj6fMk+zRCFbGOyVOmoNFYvoG7u7ubThXHbcFOOGInDFYPxGUWR0dH\nPvhw0lPlPDw8OPr3AeZ+P5fomBja/fgzbdpYbrQDVK9emYjrezDSCqTEQe6lZs3KNuuaHezbt4+2\nnV9E1myHXcxVyn8xk4N7duLs7PyYnI+PDyOHD2fuh43R1WqLw6kd9O7Vg0qVKuWS5go5iaGQDZPZ\nek5BCPET0AGIklIGmcsmAYOBFP/HCVLKzebvxgMDAQMwSkpp4YiflecUbt++zeefT+f27Ri6dGlL\nz54905Q9cOAAP/ywEJXKnhEjhlgNBwGwYcMGevUZgL1vE4zRp+nY+jmWLV5gdSnj/v37DBo8hIiL\nlwhp8jwzZnyJnZ3lit6E9z9i+tIz6DSlQSaBdKFxyTP8tTvbQqo/M9euXaNe/WYkaCsCWjyLxHDo\n0G6KFy+e26pZUKlmXf5tMQHqdQMpcZ71Ip/3acnIkSOtym/bto1Tp04RGBhIhw4d8mxEVYVHZMU5\nhSvSeph6a5QTUTa1lyfIzg0LTImpawPhqcomAm9Zka0KHAccAD8gArCztvGTFcTExMhSpcpJlaqb\nhDFSoyknP/98ulXZXbt2SY2muIThEoZIF5di6XqfXLhwQf72229yz549aW52arVaGVSnoXSs2lvS\nfJ7UlGssBwweZlW2y4s9JS4e0mHcO1L9xRRJ0RKypE/5zHc6h1iyZImsVa2arFOjhtywYUNuq5Mm\nRUuVlny/dJ3iAAAgAElEQVR9WfKrNF3dJ8tx4yfktloKWQhZsNF8UXpn+HqyPeAn4PYTY+AkIBI4\nZr7apfpuPKZEOmeBNqnKg4Fw83ff2NKnp/Y5Oys3d8bPilEYa0VuPPBeqs9bgIbW/pKygu+++046\nO7eUsMt8LZJubsWtyoaEtJfwvoR95muUfOmlPja1v23bNunmW1fyP6NkpJS8EScdHDVWPVpat28v\nHd4dI13jo6VrfLR0WvmrLObnZ1P72cWqVatkcY1GdgfZFWQRZ2e5c+fO3FbLKp1eelnaV6wjhX81\nKSoFS8eipeSWLVtyWy2FLCQrjMI/slyGLytGwdYX45TVnDCgvvl+E9DWln6ld+WW99FIIcQJIcQC\nIURKmi0fTNYzhUigdGYrjoiIoE6dRmg0HlStWpvw8HCrclqtFqMxdfA1V/R6nVXZxMQk4HHZxETr\nh7wyilarRTh5/LchjUqDsHdAp7PUwbdcOUSqbGTCw51ixYqmoWsivfsPwrWYJyXLlueXJb/apGdm\nmTN9Oi0SEggCagHPJSby/axsSyebJnq9/qmHBkv7lMChuIFSv03Cc2o/IImSJTO+VKBQODCgyvD1\nJFLKvZhOvT6JtSWmLsAyKaVOSnkZk1FoIITwBtyklGFmuc
VA1yzpnBVywyjMBcpjGjNuAumF28zU\n5kFycjIhIW04caI8iYlT+OefOoSEtOH+/fsWsp07d8bBYR8mo3sGZ+cvePnll63WO3RoPzSaucAh\nYD8azc+88UbfzKhmQZMmTXCMj8Du6Odw6yCOe16nbr36Vtfe+/fujeqbb9Gv24hh71/YvzWON/pY\nb3/oqDGsiYgl/v31RA+ey9C33mXPnj026ZoZhBAYMcWIjQOMgLCyT5JdSCl5Z9z7OLu44uruQbsu\nL6WZJW7l2jUUWzwNx7rV0XRrg9MbvVi91npSJIXCSza5pGbmxfjJ8us8wwtzRsnxbXWZKuagEGI+\nsN788TqQOuxoGXOZBZMmTfrvPiQkhJCQEMCUtD0uLhmjMcUzpgkGw35OnjxpkafY39+fXbu2MHr0\nOKKjt9Op0wtMm/axVZ37938NnU7H11//gL29igkTZtoU4A5MHj0H9u5k6Mi3uXT6dxo3qMvsr1da\n3bxs3Lgx9apVZu+w0Qg7QSl3N/r3tW4U1m1Yj97JBYeZL2J4EE9Sueps2rI1S6KaZoTXR46kz/4T\nmPbajNjZCbaOGJEjbQMsWrSY75ZvQv/+NXByJ/T31xj99jjmz51tIZuYmIh99F0cAk1xrQw3o7iG\nbTNAhdwlNDSU0NDQLK0zvcH+cGg8h0Otp3tNh7lAymDzCaYXY9tSGWYhOW4UhBDeUsqU01wvYto8\nAVgHLBVCzMRkBQMwraNZkNoopKZIkSLodPeBeMAF0KLT3UkzEXzt2rX54IOxxMbG0rhxY6tZx1J4\n/fXBvP764Kd1L1P4+/uzbaPVPN2PMWv2LC49PM/H13ujclKxZlgog4cNYu3vf1jI6o1aKr5an2qf\ndEcXl8CuRh8TefVKluqdHqGh+7FTtUan+xrQ4+DwOrt3/0WLFi2syp87d45Dhw7h5eVFy5Ytbfbo\n+XP3XyTUHQKupkh7Sc+NZdefb1iVVScn8bD7cJLfGwqXrmJcuxmPfq/b1L5C7pL6JRFg8mTr52ky\nQ3rnFGqFuFMr5NHZoh8mPz34YiZfjCPN5WWeKLf6wpwVZKtREEIsA5oBnkKIa5g2WEKEELUwLQ1d\nAt4AkFKeEUKsAM4AemC4eaMow3h7ezNkyOv89NMXJCYG4ex8js6d21GtWjULWb1eT6tWHThw4BxS\nemFnN4ING1bRMg8GT1v2+zKC+1dCrTHlZGgwpBq/dNxqVVYloPzgZgghUBdxoeyrjfCMtUwIBKal\nlo0bNxIeHk5gYCDdunVLc1BOTExkyZIlxMbG0qJFC+rVq2dVLizsODrdUFIymWu13Tl4cKdV2VWr\nVtOv/1DsiraE+JO0aVGHlSsW22QYypXxRv1nGMlyKAiBuBZGaW9vq7IVy/vS2uss0RtnUsTBwNEy\nagIClbMHCo+T1ecUMvtiLKWUQoj7QogGmF6U+wLZtlGXrUZBStnbSvFP6chPBawnH84g33wznVat\nmhEeHk5AQD969OhhdZBZsmQJe/acx9SkPXCCl17qw927liEpchudDk5tuEaDIdWws7fj9IYrGNJw\nhQ6sXInbG4/jP6I1xmQ9cdvPUr3vC1Zlx7z3Hj9t+APZoQV2ny7jjy1bWPTDDxa/V2JiIs0a1aWE\n9jKVPZLp9JkDs+ctpIeVcx1VqlTk1Kld6HTPAxJHx1CqVq1oISel5LUBr5NQfiu41gVjEtt21WPb\ntm288IJ1fTPCu2+/xYpVTbm1oCVoimJ/5W++373Dquw33/1Elw5taFEZTsWCqmgggwdn7WxQIf9j\nS/iKLHwxHg4sBJyBTVLKbDukVGiT7PTs2ZPff4/DdFYOIAF4HSn1NtVrMBiYOfNrtmzZSdmypZk6\ndTLeabypZpSZM7/i/c+m4FjEAUd3J+5dukeLJi3ZvHaVhew///xD8xdaoS5blIRb92hUqy5rf1tl\nEb7i5s2b+FerivP5fd
gVLYKMT0BbpRlh2/6katWqj8kuWLCAVdNHsbFzAkLAgevw8g5PLt+wzL8Q\nFRVFo0YtiY62R8pk/P092LdvG25ubo/JJSYm4urmgbGe9j8PLJfr/Zg9qTkDBgyw6fdKSEhg8+bN\naLVaWrZsme7p76tXr7Jnzx7c3Nxo165dukuIuYnBYOCrr6eze98OSvuUZeIHn9j876owkBWH13bK\nRhmWbyH229ReXqBwnd9Ohcn1cBvQHigFrEMI2weEoUP/x9Klu0hIaIxKdZnNmxtw9uzJNPc1MsLo\n0aP4K+wwGzZuICnGSIWyFfll/jyrslWqVOFc+BmOHTuGq6srwcHBVmdKsbGxOHgWw66oSS/hosHB\nx4vY2FgL2bt37xLoofvPe7ZScYiNe2C1/ZIlS3L69CEOHTqEvb099erVs5qK1NnZmQoB1Ym4OR3p\n/TYknMIQs4V69d7N6M+SJhqNhu7du2dItmzZsvTp0+epcjExMQz/39scO3GKqlUCmTtneo4OyiNG\nDeVA+Do6jSrGv2H/0vj5LRw/ehoPD48c06GwUthiHxXamcLx48epV+959PpkTLM4D0JC6rJr17PP\nyvR6PU5OGgyGasAVwB1nZwfmz5/MK6+8kuZzBoMhQ3mJIyMj0Wq1+Pn52ZTHGExv00V8vHGY+BaO\n/V4iee1WEt/8iJsXLlr46h89epR2LZqwqmMClYvBu/vUPCzbihVrN9qkQ6s2ndgZGobUx4FQ4ezs\nwKULZ7MkrlNWYjAYqFmnMefvNyC52Cuo7v2Br1jPmVOHsyRPstFoBLAa4gRMubxdXTX8FlUXFw/T\ne9zEDpcY1Xd6mm7UCiayYqawWYZkWL6dCM33M4VCGzq7Vq1aDBs2ENMZEgNFiqiZP//bNOWNRiOn\nT5/m/Pnz6dZrMKgw7RHNBQaSmHjdajhsgD///BNPT18cHNRUqhz81LrLlClDhQoVbDYIYIpRpNK4\nkLh8E3fLNyRhzmKcvX24ePGihWydOnWYt/BXXttdioo/O/OwbCt+XGRblNbk5GRCd21D1r4M9e5A\n/fvYF23F9u15L5fy+fPnuXwtmmS/b8CjIfqyU7lz354TJ07YVK9Op+O1IUNw1GhwcnVlzHvvYe2F\nR0qJBOzsH4019iphVVYh6ylsobMLrVE4ePAgCxYsA6YAvxAfX5uBA4dblb1z5w7Fi5ehevU6BAZW\nw9u7PAkJlr7JRqMRIZKBNzEtSTUF6qBSWa7SRUZG0rVrb2ISFiFdkjl/9TVater831tjdqNWq8Fg\nQLV2A+rL11Ft3YkwGNNcU+/atSsXrt3i3oMEVqzdaPOyhZ2dnWlZy/gQ7F1B2IEhLsfX9BMSEggL\nC+Ps2bNpDrJqtRqjPhFksqlAGjDq423WddKnn7Lm4r9orp3E6d8wFuzczrdz51pt/5U+vZjS/RJh\nm2NZMvk6l4/rbNqQV8g4+dEoCCGeWZlCaxT27NlDUlIwUBZQodO9xN9/Wz/527JlO+7dK43JcWoB\nt25p6NjxRQs5lUqFg4MjkOKrLNFoHuDj42Mhe/jwYezVDUDVAoQ9UjWKqOgYbt++nUU9TB8/Pz+a\nN2mCuk9vDL8sQj3wNWr4l08z+mtmuH//PuvXr2fTpk1WjSeYfqtRo8bgFNEcLgxFdb4zxZ0iad++\nfZr1hoWFsXLlyqfOqDJKREQE/tWq03rwEOq0aEmv1/pbNcrly5enUqA/nGgNN+ZDeAe8S7ra/Ftt\nCt2FcexwhIc7diU90Y8czObdoVZlf/z+Z9o/N5gdX3lguFiPv/aGUaxYMZvaV8gYeuwzfOUhzgsh\nvhRCVH266OMUSKOwY8cOvv76azZu3Jjm29/NmzcxGi9gCsQAcAkprf+lnj9/BWiDaV9eDbTixImz\nFnJ2dnZMnfopGs3bwEKcnT+gUiUXOnbsaCFbqlQpDLqzIBNNBcZLGA0JNm1IZwYhBGuX
LWNi+3a8\nePQw4xo3Ysf69WkuTWm1WpYsWcKsWbM4efJkmvVGRkZSuUYd+nzwDS+/PY2gug2tbl4DNA95DqPx\nOqoKlxAOJ6j/fH1c0sgoN3LsOzTv3pNBP/5KrcbPsXTZ8sx3+gleHfIG0a8N4/6WgyTuPc2mM+dY\nsmSJhdy9e/c4e+EM9r2DsKu2E/teFYmMumo1o15mKF2qFPL4o+xtdsdPUaak9f0UBwcHJn30CTu3\n/c2SRb9RtmxZm9pWyDjJOGb4ykPUwhRRdb4Q4qAQ4g0hhGUGL2tkV6S97Lp4SpTUceM+lC4uZaSj\nYyfp4lJeDhpkPRz11KlTpRBeEipJaC7BXdrbO1iV9fUNlNBRwnIJyyQ0k0FBaedI3rp1q5ww4X05\nZ84cmZiYaFXGaDTKl18eIF3cqktn98FS4+IjZ8/+Lt2+5RZJSUmydsPnpUuN5tLxhWHSuWgJuXbt\nWquyL/XuJ+07fiCZJyXfG6W6+VA5YvRbFnJGo1G6eZaQ/Py35IiU7IuXLhWqyO3bt1vIhoWFSU2Z\ncpLD9yTnpGTdSenk5i61Wq1N/Sri7SMJ+1dyPcl0vf2RfG/cOAu5EydOSPcqlaVzXOx/V5H69eSe\nPXtsav/cuXOyqLeX9OjRVXp0bie9yvvJGzdu2FSnwuOQBVFSl8juGb5sbS87LiAE0wnoBGARUDE9\n+QLlknr79m2++uprtNrvAA+02gSWLv0fY8eOpEqVKo/J1qxZE2dnNQkJbQEdUIqAAMu3f4ANG36n\nTp1GGAynMIVuiGP9euvRVwHatGmTZmayFIQQLF26gM2bN3P16lWCg4ekeUo4t1m2bBn/JjkT/+5W\n05mC+r0ZMvI1unTpYiF74cpVDPXNcZmEINm/KRGXLUN5JCUlER93D4IamgqcNVC1LlevWuZzvnbt\nGqrKNcHNvI9RKQgc1MTExNjkFlq5WlXC1q7AOOJtiH+Iy45NBI1900KuXLly6G9HYTxyFLvgOhj/\n+QdtxAUqVrQ8lJcZAgMD+XH2HKZ9NQuVvT1T5y9Qzh7kQfLYslCGEEKoMCU4G4ApDPcMYCnwPKYo\noIFpPVuglo9iYmJwcCgKpGyCalCrvbhzxzIeSbt27Rg06CUcHRfi5raFEiX2sWqV9TDTNWrUYM6c\nGZQubUfZsk4sWjSPcuXK2ayvEIL27dszdOjQpxqEXbt20axDW+q3DGHe/B9z1PMkOjqaZJ9qj8J8\n+wYRF2N5cA2gaaP6OO37DnRaSHqI5uB8Qho3sJBzdnbGL7Ay4nfzxurlc8gD2wkODraQrVmzJrpj\nf8M/x00F637F3c3V5jDXS3/4Ae8Vi3BrXgunxlV4MbiWVddhDw8Pfl2wAPuXeqJu3AS7th34YdYs\nmwfwLVu20PeNERyp2ouD/p3o9FIvwsKshvtSyEVsCZ2di/yLKRT3F1LKWlLKmVLKW1LKlYD1GDkp\n5PbU5lmmc2mRlJQkS5b0lUKMkLBSwrvSw6OEvHv3bprPXLlyRR47dkwmJCSkKbNs2TKp0fhImCbh\nY6nRlMzRZCz79++XajdXaeddTtqVLCPVniXkrG/nWJXVarXyf2+9Lb3KB8iKNerI9evX29x+WFiY\n1BT3kkw5LFnwUKpbD5WtO3a1KpuQkCDbdekuHZxdpMrRWb7cd4DU6XRWZc+ePSt9AypJR4+i0tHF\nVS746ec0dVix4nfp7O4hHYsUlSXL+cnjx4/b3C+tViuHjnpTepYtL/2qPT1LXGxsrDxy5IiMjo62\nuW0ppXyuVTvJmGWSVdJ0Dfha9uzTP0vqVjBBFiwf/SD7Zviytb2suDDF7fnoWZ9/6uE1IcSXmMK7\nJmLKhlYTGCOl/MUGK/bMPO3w2pkzZ3jxxd5ERJyhTBl/Vq36lbp169rUZqNGrTlwoDmmECYAG+jY\n8QLr16+wqd6M0r5LZzb/eRBq/gYOReGfwRRXR3En
0jL66fDRb7FwXziJA76GO9dwnv0aoZvXU79+\nfZt0WLZsOcPffIuH92Jp0qI1K39dlK73S2RkJCqVCi8vr3TrlVJy584dPDw8nuriqdPpuHv3Lp6e\nnmke9MoMw0aPYdGh0ySO+QpuXUXzcX9CN23IsWW8Bs3bENZgBNQ3L8Nt/5Gu9/awZlmu/NcqkGTF\n4bW5sn+G5YeJhTa1l1UIIQ5JKZ/pH3JG/me1kVLeBzoCl4EKwDvP0lhOULVqVc6dO4Fen8yVK+ds\nNggAN27egMfi7Gu5di0yLfEs5+LFSCg/ATxDwKMmBM7iQZz1rGK/r1lD4uBvoWw1qNOWpNZv8Me6\n9VZlM0Pv3i9z9/YNkpMS2bl5fZoGISkpiY7deuIfUAnf8v68OmAwen3a8aSEEJQoUSJDPv8ODg6U\nLFnyqQZBSsmFCxc4ffp0um3/vnq1ySAYjeBflcQuQ7Lkt8ooIwf3R/PLGDi8AfavRLNyIsMH9sux\n9hUyRn48pwDsE0LMEUI0EULUSbky8mBGFsFSZDoCK6WUcUKIPH+UMiPhl6WUhIWFERsbS3BwcJpr\n1GpHBxCzQcYDOhALcXS2XPt+FsLDw7l69SpBQUFpuhnWq12Lc39de1SQdANvL+uuixqNC8TegNKm\nfSSHu9dxd0tzTynTPO13Hf/hJHbc0KH7PhYMOtZ+3YkZX33De++MzTId0kOn0/HiS6+yM3Qv9moX\nSpfyYM/OzVb/btVqNU5vtcTNOZmHsVpk8dK4Dcy5XCd9Xn0FKSXf/PAVKnt73v9pHq1bt86x9hUy\nhjZvuZpmlNqY4vc8mTms+dMezIhRWC+EOAskAcOEECXN9/kao9FIjx592Lp1HyqVD0bjBbZtW0/D\nhg0tZAMDKxER2xh0t0DYYafqRtUqtse8Gff+RGbPm49D6SB0V4+weP48unfvZiE35ZOJrK1Zl4cH\nToKdM+q4Pfy4zvrS1ajXB/D2xBeh69tw6wKEraHvz6dt1jWj7P77IEkt3gcHR3BwJKHpEEL/XsN7\nacifPHmSAwcO4O3tTYcOHWxeFpo951t2nrhLYsfLYKfm4om3eWPEW6z53fL8QakiKl7sZmTw+14k\nPDTSr8EV3N0z5sqdVfTt8yp9+7yao20qZI48NgPIEFJmImDTE2Tkf+Ak4DmgrpQyGVNaM0tfxHzG\n6tWr2br1OPHx84mL+4wHD0bSq1d/q7IzvpyMu341ThoVzk56ioo/mTxpvE3tHzt2jNnzFpAw6gRx\nr20hYcBW+vYfSHJysoWsu7s7RdyK4xBnQHVXjcrePs1ZzW9/rKT68HpU9d5LzeejKFmzFBs2bLBJ\n18xQvpwv9uf2mj5IicP5vfiXLWNV9tely2gY0poxyw7yypiJdOzW0+YwH0eOnybRqzvYO4IQ6Hx7\ncyLculG8dSuKDn1NnmoaVzteeNmVq9dyLkudQv7AluUjIcRPQojbQojwVGVfCiH+MedoXi2E8DCX\n+wkhEoUQx8zXd6meCRZChAshzgshvsmI3kKIjkKId4UQH6VcGXkuI0bhbylljDQnGpBSxmPyc83X\nXLp0ieTkIEwnlAGCuXnT+oBQuXJlzpw6wvT36zFzYlPOnDpi84nSy5cvoypbB1w9TQVl6oDK0ar7\n7JdffkVU9PPojDvQG38nIWkqw4dbf/e+fOky7pVKkXg7EW1sMh4NynDh0gWbdM0MX3/+KZ6Hf8Ft\nemvcPmtGmSt7+Pij9y3kpJQMGTacxNHbSei7gIfvHmBv+AW2bLEtd4hfGS+4sgKMOpASLq/Aq5Sn\nVdnAShXZtSYegKREI/s3G6lcqYpVWYXCi41hLn4G2j5Rtg2oJqWsicl1NPUbZoSUsrb5Sh2MbS4w\nSEoZAAQIIZ6s8zGEEPOAnsAoTFE/ewIZ8qNPc/lICOEN+AAa8waFwLRG5Q5oMlJ5XqZOnTo4OHyD\nTtcTKI6d3Qaq
Vaudpnzp0qUZkYEE9Hq9nqmfT2XH3l2U9vJh2uSpVs80BAUFob90AG6dAa+qcHI1\nGie11bDRkZG3SdbVSWXC63DzpmXgNICSxUuyf9R2ZIWPIOky4vJ3jJzd46l6P41z587xwccfcCf2\nDh3adOCt0W9ZXerx9fXl3MljhIaGYm9vT4sWLdBoLP+5aLVakhLioXR1U4FKjSwdZHPsp6T4OMok\nh3F3XWnsHDQ46qOx01g/ZDZ/3hJatWnK5sXRxNxOokXztvTt29em9hUKHracP5BS7hVC+D1RljoU\n8EEg3eQf5rHYTUqZcohlMdAVkzdoWjSWUgYJIU5KKScLIWY8Rf4/0uvtC8BrmOJAz0hV/gCYkJHK\n8zItW7bk3XeH8emn/XBwcKFEiWKsXr3Z5noHDBnI2i3b0esFKnUE27fW4+ypfyhevPhjchUrVuT7\nWV8xZFgj7JxccVIJNq9fYzX20AsvNGP16mnEJ3UFiuDkOI02bZpZyAHExCQia6+BYqZsUUIXR3S0\n9YNmGSUyMpLnmj1HrbG1KVG5FN9++i1R0VF8MfULq/IeHh5WTzunxsnJiSo1anN201QM7cfDlaPI\nU1to2NC2Zbn4B3G8Uy+eEL94tHpI0MHYI9aD8gUEBHDmVASnTp3Czc2NKlWq2JQfOjt5+PAhI0a9\nQ+juffh4ezPvu+nUqFEjt9UqFGTznsJAYFmqz+WFEMeAOOADKeU+TGNwanfH6+ay9DAHVSNBCFEa\niAHS9w83k6ZRkFIuBBYKIbpLKS3zPhYAJk58nzFjRhIXF4ePj89T8xQkJycjhLCaSQxMs4RfF67G\nTt0Og/Yt4CQJqrEsXbqUkSNHWsj37fsq3bu/yJ07d/D29k6z3lde6c3ZsxF88UUABoOeF1p356uv\nplmVNRj04PAorLVUF0er1aXZJyklCQkJaDSaNAfE1atX49exPA3eMRmakjVL8UOdH9I0Chll05rf\n6dCtF6eHTsLVoxgL58+zCEeSWTq+2IORA1fSsEwCxZzh7d0aOvZKe6bk4uJCgwaWJ67zGt179GX3\nPxq0ZX7hatwhmoS04ezpY0pYjBwgPaNwJfQyV0KfbR9KCPE+kCylTElOcgPwlVLeNa/OrBVCVHum\nymGDEKIo8CVwxFz2Y0YeTG/5qK/5gJqfEOKt1F9hOrU38xmVzVO4u7s/1eMkOTmZPn0Gsnq1yeOn\nf/9BzJs3x8KI6HQ6pEzEoP0XUwwqJ4yGSpw5cybNujUazVP3J4QQfPLJR0ye/AFGo9FqfoYUBvR7\nlTmLBpNQ4UtIvIrzzR956aU/rcoePnyYDl16EBN9E/cixVnz+1KaNbOcgQghkIZHXsjSYMySN+qy\nZcsSfng/Op0OlUr11Drj4uK4desW5cqVSzPjWceOHbk15Ste/XQi2uRker/Slw8mPemVl7/QarXs\n+HMThhb3TRvo7rUwPtzGjh07MpRKVME2tKR9hsYrJBCvkEcu3/smWw+//yRCiP6YcgG3TCkzO/Ik\nm++PCiEuAAGYZgapvTXKmMvSREqZ8o9+lRBiI+AkpbyXEd3SWz5KWQh2w7SXkIJ44nOB58MPP2bD\nhvMYDL8CBpYtm0rlyrN4++0xj8mZDmA5AsHANOAyyHfx8emaJXrY2dk91WVz+LDB/LhwIYlHuoA0\n0LJjK6vLDImJibRp14W7AR9BUEXuJlynU9eeXIo4Y7HU1b17dyZMnIBmkgslqpdg70e76f1y7yzp\nE5DmDCk1P87/iZGjx+DgUgJ7Yzyb16+mUSPrCdUHDxnC4CFDsky/3Mbe3t5kMPV3wd7LtIGuu4Oz\ns3Nuq1YoyOqYRuZN4neAZlLKpFTlnsBdKaVBCOGPySBclFLeE0LcF0I0AMKAvsCsNOruzqPx+bGx\n2ny62zI65ROkt3w0z/znpKdVktc4evQop06dIjAw0Oq5g8yyY8ceEhM7AKa304
SEtmzbttvCKJhO\nz2ox7RsJoDxQC09P694vmSExMZEtW7ag1Wpp0aJFmi6prw0bwf2XBiJHToKH99nZL4TffvvNIpfv\nhQsXSNJLHE69S/FKJYn5NwqDoydnzpyhSZMmj8mGh4dj0BTnwCEv7PbdR1e6GfsOHbe5Txnl3Llz\nvDl2PNq2h9F6BMDV9XTo8hLRN69mSWrSvI5KpeKdd8fxzbzWJJR4HceEMHzc49JNSKSQddiypyCE\nWIYpPo6nEOIaMBGTt5Ea2G6eHe83exo1AyYLIXSYEr28kertfjiwEHAGNkkp09o07kT6L+3PbhSE\nELNTfZSYRrn/PkspRz2t8tzgyy9nMGnSNOzsKmE0XmDkyMF89tkUm+r09fXh+PF/MRhM3kkq1Xn8\n/Cz3eUwzBRWQMuvTARd58OCBTe3HxcURHNyIyMhkwBm1+n8cOLCHqlUtkyqdPHkCw9tzTBFN3TyI\nb92do8dPWBgFR0dHdElR/O/wK5SqVpzYS3F8HbQkJaDWY5w4cQJd8+7ox5j9DR7Eca790/a5so7T\np1WWdjMAACAASURBVE+j8moIHgGmgrKdSAwbSFRUVKFZU//0k4kEVavEnzv3Uc63MmPGzFVmCjmE\nLUZBSmltSv1TGrKrAKv7t1LKI0BQBtrrnxn9rJHevOgIj4zBZOAjHhmGPLl8dOfOHT78cBJa7Tig\nGPCQWbOmMWjQawQEBDxzvV999RmhoQ2Ij/8H0FO06D2mTJlnIWc6eKYHPsR0yvwy9vZ2NqdNnDz5\nYy5ccAQGAQKtdi+vvDKA48cPWsiW96/Avd2bkK8Oh+RkNPu3U2mIZTydpKQkipUuQqlqpqWiYuU9\n8KpUwmr7FSpUwHHRcvSJCaa8B39toox/BZv6lBn8/f3RRx2GxChwLglRB7DDkCUzsPyCEILevXvT\nu3fWLdspZIz8mE8BTIfXgKqkLHHw2F5DmjzN+yil8tFSykU26pjtREVFoVYXQatNGYRdUatLcfPm\nTZuMwvXr19Fq9eh0voCRhw8juXXrlsUSjqOjI6VL+3H9eiNMOR2CcHBYbrPr4NatOwF/HtlkP/45\nG2pVdsm8uTRp8wK6jb9iuHOLpnVq079/fws5Pz8/tPf0XN53Hb/nS3PjeBT3rzwkMNAyTlK3bt1Y\nuX4j63pWwcGnHFw9z4pNOXdKulatWrz95nC+nBmEungVdDGn+O3/7J13eFTF94ffu7vZJJsCIUCI\nEDpBOkhvoUgVlKY0UVGsKCIICigSFBQRxMpP5AsoSFWUroB06b1KDwQIBEhCSLJ99/z+2KDo3oTE\n3SQQ8j7PPOzenJ2Ze3e5c2fmnM+ZNztLexEFFOApd1mehCyRHrzmD7TG5XX0BK6YiDty751tJpQr\nVw6t1gLsx/Wkfhy7/YrqMkt2eO+9iZhM0YBLo8Zo/IoPP5zCggWz3GwfqvcQly79AkoQSDIarUH1\nRguuqOpXBg7n7NnzNG1any8+/5igoCA3u8JFCoNmJzhrAX6g2YLeT12kq0qVKpw5cpj9+/cTGBhI\n3bp1Vb16goKCWDh3Ib269sIQ4kfq9TRmTJ+hKnWtKArzZs3g0KFDJCQkULt27VxPGv/MU31Yt3kL\n52Iv0PSxzqpeUgUUkBPci9pHeBC8lq8yr+n1ekJDiwBzgcHAdAIDDQQGBqra//bbb5Qt+yAhIWH0\n7NmP1NRUVbvUVCNwu0dOUVJS3AOi7HY7K5YthQfPQKVNUOUSRmoxb948N9vk5GQaNmrF2u11OZXw\nJfN/SeORTk+orukPG/o6Or0JlHGgvItGf4lnn1F3RbRYLAx5exTde/ejW5+n+OmnjENM2rdvz6Xz\nl1i/fCOXYuN44vGeGdoqikKtWrVo3bp1rg8IN27coFFUa7YXacmlnjNYes5Jp+49Va9VAQV4Gyv6\nLJe7iH8Hr9nxNHhNUZRU/t478FcU5fbdUh
GR3JWTzAJnzpzh8uVEXGv6NkCPyTSN/fv3u7kvHjly\nhO7d+2AyPQeEs3z5L/Tr9xxLlrirjz73XE+OHBmL0RgCWDEYJvLcc5Pd7FxxCk7QGMAn/fpr9Bw+\n7J7PecuWLVgcFXAaXBG8FqnHrl1FSUhIcFsr79atG5M/jmPMBx9itZjp06cHkz75UPUavPrGMBbs\nOY9pxBaSEmLp/0pvSpZ8gCZNmqjaBwQE8OCDD6r+7d+kpKT8FejnjSQ3WWXLli1YilbC2XEEAJay\n9dg5JJTExEQ399kCCvA29+iewvJ/Ba8J8L+sfDCzPQX1x+u7GF9fX0RsgANXvIADh8OsmsBlzZo1\nOBz1ubWhbzb34bff1EXmXnzxeSwWC59/PhKtVsvIkdH06OEuV6LX61G0fkhsdwgdCqY9YNxB3bqP\nq/fVcdPlc64oIEZE7Bmuk7/++qu8/vqdtZcW/7IE01sboHg5KF4OY9QLLF+xMsNBIau8P24C48aN\nQ+cbSFjxomxYu4KyZct6VGdW8fX1RUy3XSurEXFkfK0KKMCb3Et7CoqiNAAuiMgH6e8DgcPAceCz\nrNSRr5aPSpcuTZUqD+IaEHcB31GiRGFq167tZhscHIxOl3jbkesYDO7r+eBaOunevSt9+z5K376P\n8uijnVXttFotbwx+DZ2chIRRKMZFlCpZzM0dFKBFixaUidDia3oKjNMxmB+hb59+FCpUSKXmrHMz\nNQ2un/v7wJVTbP1ji0d1rlu3jolTpmNrfQpT68vE+vejR89nPKozO7Ro0YLSgeA76ynYNB3DVx3p\n9/QzuZ77oID7k3ss89o00tNEKooSBUwAvsGlpeTuMqnCHXM0321klqM5OTmZEiUiMJsbAKlAAH5+\nezh58ggRERH/sE1LS6NOnYZcuBCIxRKGv/92vv76E/r3d7/Z7dmzh4YNW+B0lgNs6HSXOXZsr6pH\nk4gwc+Z3rFm3hdKlSjByxLAM1+BTU1OZMGESp07H0rxZPQYOfNnjZRnF1x/8gqDVC5AQC0fWUqdi\nBPt27/7PdU6cOJF3ZsZjr5q+ZGZLRv97SSwm9T2YnCAlJYWPP5nMyZjztGjUgFdeeSlXl7AKuDfx\nRo7mRyXrudiXKz3zNEezoigH0yW5URTla+DarQDk2/+WGffOvCgLXLx4ER+fQpjNfy/t+Ppe5uzZ\ns26DQkBAAPv27WDGjBlcu3addu3eJCoqSrXerl374nT2AF4CBLv9E7p168ORI3vcbBVFYcCAZxkw\n4Nk79jcwMJBx46KzfoJZwC+wEOa2z4HOB4rVhGsnqRKpLh2dVcqWLYvvzZ+wOywu7Z2rvxNesqx3\nOpxFgoKCGPd+tNfr3bJlC0uWryAkOJiXXnqRYsXUYzUKuH+5x/YUtIqi+IhrHb0NcLveS5bu9/lq\nUIiIiMDhSAHOAWWBy1itcVSsqH5TDAwMZPDgwXesNzHxJnAr57WCKyHPQW902ev88O3XPPHk00jZ\nmpB0mUIaJ9Onb/Cozscff5x5C5fw+6bqaIPKIzcOsODXpV7qcd6xaNGPPDtoMMYeA/E5fY6v6jfk\nyJ5d91VQXAF35l7aU8Alw71JUZTrgBHYAqAoSiUgS4J4+Wr5CGDZsmX07fsMWm1hbLZEpk37yuPE\nKQ891IT9+wX4ENcm9lDatYtg9eqVHtWbEzidTka+O5rZ8xYSGBjA99/+n8ebzOBaFtu5cycJCQnU\nq1dPNRnQvUbpKtW4MHQq1HXFPOjff44PmjzIW2+9lcc9K8BbeGP5qKVkPc/KRqVjni4fASiK0hiX\n++ma9EyZKIoSCQSKyL47fT7fLco+9thjXLwYw8aNP3PhwtlMB4RPP/0UX99CaDQGKlasRmJioqrd\n+vWrKFYsDlfeoY5ERFhYujR3U0zMmzePcuVq8cADlRk1agwOh0PVbuy4D/nqp9Vc6T2V001ep13n\nLhlKd1
+6dImHH+5C8eLladq0PWfOZJy2U1EUGjVqRKdOnfLFgABgTE2F4n9rONmKluRmivo+ycWL\nF2n98GMUDytPs+YdOHv2bG51s4A8xsN0nLmOiGwXkV9uDQjpx05mZUCAHB4UMkhaXURRlLWKopxU\nFGWNoiiFb/vbyPTE1McVRWn3X9sNDg6mVKlSFC5cOEObFStW8OabI7Fan0BkKGfOKNSq1UDVtnDh\nwly9ep4jR/Zz8uQxYmNPZKjnnx02b95MyZLl0On01K7dgHPnzqnarVmzhv79h3Du3GQuX57HxInL\nGTNGXeRv2szvMD43C2q2g5YDMDV7jgUL3TfKbDYbzZt3YNOm6ly7toIdO1rTrFk70tLSVGq9O7Db\n7Vy7dg2n0+mV+h7v1g3/ia9CzJ+w7Tf8ln7LYyqeZTabjeZRHdi8rw7XZDXbjzxM02ZtMRrVM7pl\nl5s3b2YYOJkbGI1GbtzI0srCfYkDXZZLfiCnZwpqSatHAGtFJBJYl/4eRVGqAr1wCTh1AKYqipLt\n/m3fvp2iRcOJiKhASEgx1qxZo2r3v//9D6gH1ALCgCe5ePGcqq3ZbKZz5x7Url2fqlVr0rdv/3SZ\n7P9OXFwcnTp1JS6uOQ7HSA4fLkLr1h1Ub3gffTQFm+1tXMq6NXE4Puerr9wlNgA0Wi3Y/pJoR7GZ\n0KnIS58+fZpr10w4HGOASjidQzEaQzh48O7cK/npp8UEhxQlonxlHihTkUOHDnlc5xeTJvJsg+qE\nvfUYFaePYtGsGTRo4P5gcOrUKa4nWHD4RYOuEk7/4RjNhT2+ViaTiUe6P05oeDghxYvTb8ALGc4A\ncwKn08nLA9+gUOFQioeVomXrTh4r+uZH7jGXVI/J0UFBRLYASf86/BhwS1zve1wJqAG6APNFxCYi\n54DTgPqjewYYjUY6dnyMpKRHsViiSUnpQ/fuvVRzFLukL27vWjKKov6ljho1hvXrr2K3/4Tdvoil\nS4/wyScZJ56Li4tj5cqV7NuX8Wxt165daDQRQGVAj9PZjMuXL3P16lU32ytX4vlnoqU4TCaTmx3A\nO28OwfBNH9j0HZpf3idgzyKeecZdJTUwMBC7/QZwa2ZgwW6/nqEkSF5y9uxZnnn+ZUzPbMAyMpH4\nRmNp26mLxzMGvV7P11Mmc+XMKU4d3EfnzurxJwEBAdhtN0DSZwZiwW7z/FqNfC+aDWlO7DsTsW+9\nyi/HTjP5s889qjM7TJ8+gzmLtmN/8DK2qknsOFaE1werB3Dez3gyKHhrtURRlLqKohxO/1uO/kjy\nYk8hTETi01/H43pMB3iAfyanvsidk1P/g5iYGJxOX1yTDYAK6HRh/Pnnn262EyZMQKs9jytvxRrg\na/r0Uc/lu2nTdkymzrjyYvhjNHZgw4ZtqrZr164lMrIaTz45kubNO/D886+oavQULVoUh+MaLkkS\ngGScTptqQFaLFk2AGcAQ4H3gFcqXV780rw58mZmffkiX5N95qnAce7ZtoUyZMm52ERER9OjRFYOh\nA/AJBkNnWrasT40ad5Rsz3UOHDiArmwTeMCVz4I6T3HzZqrqAJoTlClThu7duxBgbgOpH2MwtadV\ny4ZUr17do3o37tiBue+roPeFgECMj7/Axh1ZErL0Cpu27MRoeA50hUHxwVJoEFu25l779woezhQ8\nXS25tWn9f8AAEakEVErP3pYj5OlGc7obUWbuT9lyjSpRogRWazJwa8M4Bas1npIl3W+gpUqV4syZ\nY7RrV4w6da7w4YejmDv3B9V6y5cvjVZ75K8u+fgcoWJF9xutiNCzZ1/S0nqSnNwPo3EQCxYsY8MG\nd5fQpk2b0qZNUwIDv8PX91cMhu8YP/4DDAaDm+24cdGEhRVG57serX4lfv5O5sz5NsPr0KtXT5Ys\n+IHvpn+TqWT47Nnf8vXXrzBo0HU+/bQPS5fO90ruZW8TERGBPe4gmJNd
B+KPgMOaq8J8c2Z/y1ef\nvcSg/leZMrEvS5d4fq3Kly6Ndk96tLkI+r1bqFCqVOYf8iIVykfga93ikg8BNKYtlCmde+3fK1jw\nzXL5N15YLWmoKEo4ECQiu9LtZt/2Ga+TFzsj8YqilBCRK+kne+tx7xJwe4RZhsmpo6Oj/3rdsmVL\nWrZsCUBoaCiffDKBt98ejU5XHofjHMOHD6VCBfWEMGXKlGH16juryU6ZMoE//mhOWtpxwE5oqIkP\nPpjhZme1WrlxIwHXoLQfCCQtLYRTp07RunXrf9gqisLs2TMYMOBFTp++QMuWTzN06BC3OsE1qzh+\n/AA//fQTFouFjh07Ur58+Tv2+05oNBr69++PSrqFu4r69evTv08Pvp9WG23JOthi/mD6t9+oalrl\nFA6Hg2sJicTfSCIsIRGbzeZxKtDPJ3zItqgWGA9sBauFoqlJjN2yyUs9vjNvDR/K4p9bc/FSE9AV\nRm87wjdT1+da+znBxo0b2bhxo1frzIG9gsxWS3bcZndrtcTGP1dRLpHNVZTskONxCoqilAWWi0iN\n9PcTgQQR+VhRlBFAYREZkT51modrH6Ek8DtQ8d9BCXeKUwBX+sajR49SqVIl6tSp45XzuHnzJhs2\nbECr1dKqVSsCAgLcbJxOJ1qtP1AMl/vqGWAHo0e/xfvv/zPhkdVqpW7d5pw6VRqLpSkGwwJ69qzN\nrFnfeKW/ecmSJUuYMuX/0Ol0jBw5hDZt2nil3p07d3L+/Hlq166dYY6KnEBE6Ny9JxtibmKq9Tj+\nR1dQt7CFTWtWeSy1kZyczMaNG9FqtbRu3Vp1ppiTWCwW1q9fj8ViISoqKtdl0XMab8QpVJAjdzZM\n54xS3a09lXtgkoiE3Pb3RBEpkp4CeYeIzE0//j/gV1zRuBNEpG368ebAWyLy6H89r0wRkRwruKLr\n4gArcAF4FleezN+Bk7gW8wvfZj8K15TpONA+gzrlbiUlJUVAI/C1wA/ppZK0adPGzXbNmjUSFFRH\n4LpAgkCM+PgEyI0bNzzqg81mk+HvvCtlqtWQ6o2ayJo1azyq7xZbt26VOvWipHT56jLwtaFiNptV\n7X766ScxGIoK9BPoI/7+hWX9+vVe6UNWWbx4sTxYs6GUi6wl4z78WBwOh0f1nTlzRvyLlBC+MQsz\nRPjWJgHh5WXfvn1e6nEBOUX6/cKTe5iUkT8zLGEbvpdCY179q6i1h0te4fBt748DJdJfhwPH01+P\nAEbcZvcb0BBXINqftx3vA3zjyXllVnLa+6iPiDwgInoRiRCRWSKSKCJtRCRSRNqJyI3b7D8UkYoi\n8qCIrPagXVJTU++YhGXp0qWEh5clIKAwnTv3IDk5OVN7i8WSnodZHddTo+DKgneLQFU9HYvFgtMZ\nwN8pNg0oihabzZZh/TabDbPZnOHfAYaPepev127m/PuzONJvGF37PpmpF1RWOHHiBO06dGG/7WVi\nS8xm1pLjvPDS66q2kyd/jdHYCVfmu3qYTG344ovcm/2sX7+ep557jeNhY4ip9A0ffrWQjzPxFMsK\nFosFja8/6NKXqzRaNH6Bmf4WCsg/ZBaX4NOyCUHRb/xVssgy4Jby5jPAktuO91YURa8oSjmgErBL\nRK4ANxVFaZi+8fzUbZ/xOvkuonnPnj2Eh5chJCSU0NAwNm1SX6M9cOAAffo8y5Ur/TAaP2Lt2kT6\n9Omvamu1WunZsx8BAUEYDIE8//wrqv7kBoOB4sVLA1/imvCsAY4xYsQIN9vIyEjS0vYBU4A9wGto\nNL6q03cRYfjwURgMQQQGFqJDh24ZBpn9sHAhxuhpUL0utO2G6fEX+fkXz34/K1euxFa8JzzQBwrV\nwRQ5k59+XKhq6/rN3u4q6sxVNdPZcxdhrPwWFKkF/sUx1prCzNkLPKqzUqVKlClRFJ9Fb8DZneiW\nvEOoj51ate4oOFlAPsBDl9T5wDag
sqIoFxRFeRaXnHVbRVFO4sqhPAFARI4Bi4BjuJaNBsrfT7YD\nceUEOAWcFpEspdb8L+SrQcFkMtGuXSfi41tht39EUlI3OnfuxvXr191s161bh9VaD6gCBGO19uH3\n39UnJ6NHj2XFij9xOP4Ph+Nr5s/fzJQp6q7Cf/65j2rVNPj4fEahQr+zdOkiatasqWL3J4GBDYCj\nwEigCCJ2VTfL2bNn83//twK7/RAOxzk2btTw+uvq+jz+BgMk/R2XoUu6hsHfX9U2qxgMBrT222I9\nrNfQ+6mvfY8Y8QZ+fiuAlcAK/P3XMWTInZMDeYsLF87Due9hRQ1YFwW7X+F6gvv3nx10Oh2b1qyi\nS7FkKiwdyCP+59i28XevRLUXcPfjyaDgrdUSEdkrIjXS/6Y+TfcS+SMuO52YmBjsdh9cUcoAldFq\ni3Ps2DE3Wey4uDgcjgu4lnsU4DJOp/p+1Lp1mzGZ2gKum4DR2Jo1azYxbNhQN9siRYqoSmr/m5CQ\nEFyzwtW44h/icDrnEBTknuhn3bqtpKU9BbjUOy2WV9mwQd1Tafw7o3hpWB9MTw9BezmW4D9W8eyU\nXaq2WaVXr1588OFk7H++hM2vCob4L3k/+h1V24YNG1K4SBGuJcaBWHmgZJlcfaK+nhAPoT4w/wLo\n/eCTlzFv/cXjeosWLcqPP3zneQcLuOewWO+q3Ms5Tr6aKRQvXjw9TuHWwJuG1RpPeHi4m60rt+9V\nYBIup6dP01N5uhMRURKt9m+xOB+fs5Qr55k/d7NmzWjevCoBAa3Qat8iICCK0aNHq3o1lSkTjq/v\nPm6FbSjKXkqWfEC13qf6PcnS72fxYtp53iwZyKFdO1XPPzuEhIRwcN923nqyBAOan2b+d1MyTA06\n5M13SLA8iqPEMRzhp7iYVIvoser5pHMCh6KDTs+BnwE0GnjsRdDkD/mBAvIGh12X5ZIfyHfS2RMn\nTmLs2I/RaCogEsOrrw7g44/db0q//PILTz45BJOpCa7sdQ7KlTvJ2bPuiqLnzp2jfv1mmM0lATuF\nCt1g377tFC9e3KNzcTgcLFiwgNjYWOrVq0fbtm1V7ZKTk6lfvwWXLwcBIWi1u9m2bR1Vq1ZVtc9L\nHqrbiv1x74B/uhtq6gLa1V3M6l9/zJX2R4+JZvyaHcjElaDVwvT3aHZpN1vWZF3+uID8gzdcUv2T\n1dWT1TAVKpLn0tmeku8GBZPJxMsvv8quXfupXv1BZsyYpiodkZSURMmS5TGZHEAocJmhQ19j8uRJ\nqvUmJiayZs0atFot7du3zzA/sMPhYPLkz/jtt02ULh3Ohx+O4YEH1J/qs4PRaOS3337DYrHQunXr\nu1a++qWXB/P9T0lYgmYBdvxvPs7wQQ0ZG/1urrRvNptp2aETB8+eRwkIppD5Jrs2b3DLvFfA/YE3\nBgV9QuZeibdjDS1UMCjkNpkNCiJCy5bt2LUrCbO5Fr6+x6lc2cqePdvw8fH5h+20adN4441pmM0t\ncCUo0hIcvIjkZM82JV944TXmzduF0dgXne4ooaG/c/z4/kxlvL2Jw+Fg3LgJzJ+/lODgICZNGpNh\nmtGEhAQGDhzGvn2HqVKlEt98M9njASwlJYW27bpy6PCfiNiJat6UZUsX4OvrLgGQUzgcDvbu3YvF\nYqFu3bq5HhBWwN2DNwYFzZWsy5o7SwTe84NCvtpTiImJYffufZjN/YDaWCw9OXs2nv3797vZGo1G\nnE4DrtiRCkBRrNbMYwDuhN1u57vv/ofR+CXQHrt9KGlp5Vi1apWq/b59+6hWvRGFCofT+uHHuHLl\nikftA7zzTjQTJy7lxIlodu/uRceOPVRlph0OBy1aPMIvvwRy+vTX/PprOZo0aZtpHISI3FEyPCAg\ngLbtW+AXpBAQ4k+7Di1zdUAA0Gq1NGjQgObNm2dpQLDb7XeMaSng/sXp0GW55Afy1aAgIrhSMNwa\n
qBUURaP6H75Tp074+BwFdgMn8PP7hR491FVSs9uHf15WnWr7V69epXlUO46dHsBN6xI2bKlA86iO\nHt+cZs78AaPxM6AR8DgmUz9+/PEnN7vTp09z7lw8NtsXQEPs9vEkJuo5cOCAar2zZn1PYKFQfP38\naRzVVlWOHGDKF1/y6cJlJL2/loR3lvLeZ//HnB/menROOUVycjJt2nbBz8+AwVCIzz//Kq+7VMDd\niF2b9ZIPyFeDQrly5ahRowq+vguAY+j1P1OyZCFV/aPIyEj69OmNKzBwPjpdHKNHuweZZQedTkff\nvk9hMLwBbEKrnYqf35906OCucrt8+XKMxlCwDgNLd7D/yOlTJzyeLej1vsA1XAFxJ9BobuDr6+5S\np9frcThMuDbZARw4namqInM7duzgtaGjMD68GWc/I3tvVOOJvs+qtr9gyXKM/cZBmapQoTbGXu+y\nYMlyj84pp+j/7Kts2VcER0QK5qL7GDV6coZJmXIKm83G3r17OXjwYK4m2CkgG5h1WS/5gHw1KGg0\nGn7/fRUvvNCIBg2O8eSTldm2baPqjW7dunV8991CYAzwMampTejSpafHfZg58xuGDetAw4aL6dbt\nBnv2/JHu/vpPjhw5AsRB6DYIuwSFXMFwnqZlHDjwadD2A7/BoHsMRbeUp592T7JTtmxZHn64BQbD\nY8B0/P0fp3bt8qoxBVu2bMFWuheEVAeND7Ya0ezYqh4pHhIcDPHn/nqvuRJDkULqm/J5zfoN67Ea\nxoDiCz4VMWr7s2FD7qmUJiQkULNmY1q2fIqmTXvQpIn3UnwW4EXs2Sj5gPwxtN1GYGAgX3455Y52\nc+bMwW6vARRKP9KcEydGedy+Tqdj7NjRjB2buV2xYsXApy74VHMd8O8Byc96LAkxf9FyKPselB4O\nDhPOA01Zu3YtAwYM+Iedoij88stcPvvsS3bv3kbNmg0ZNmyIqhx0WFgY+uRfsYkTFA0k7ickVN37\naUL0u2xr2x5z3EkUmwXD9sWM2faHR+eUU1gsTtDvB5+yrpwClh1cuOA+gOcUQ4aM5MyZBthsnwFO\nDh16mg8++IiPPvog1/pQQBbIJzf7rJLvBgWn08mcOXPYt+8gVatWZsCAAeh07qfpmqqfxiVV7oNL\ntDX3LkeHDh14L/ozHM4E0ISCbT8+OvHYdfLPP4/BQ/Ndb7T+OEO7s3Dhj26DAoCPjw/Dh7tHZf+b\n3r17M/Xb7zm6PgpnUCRcWM7MBbNVbR966CH2bd/KwoWL0Gq19PtyF6VLl/bonMAlSjdt2jTOnIml\nSZP69OzZ0+MkNz46O5brL4BxKdhjwRFDWNjjHvc1qxw6dAKbbRSuPTAtZvOjHDhwdy613dcUDAr3\nNv37P8/PP28kLa0CBsMKli5dxcqVS9xuIL169WL+/KU4HB/iko+IpUyZ3Ms6VbNmTYoXD+Hy5crg\nUwOse2jbqU2GiWMOHTrEJ1O+Is1kZsBTvejUqZN6xRodXJ0HZUaBPRWu/4S+mmc3Zb1ez5YNv7Fs\n2TISEhKIinqLBx98MEP7yMhIRo/2XlyC3W6nWfN27D94BYezEP83bQE7du5nyqcTPKq3Rs2H2HGq\nNuITCfhgMH7JQw/V9ri/Fy5c4P0PJxJ/PZHundvzzNNPqQ5gdepU5c8/f8RqbQk48PNbTN26nref\nHQ4ePMjEiZ9jMll44YV+dOzYMVfbvyfIWLg4f5JTmtw5Vcgkn8KFCxfEzy9YYKRAtMC7EhBQYntg\nGgAAIABJREFUTA4cOOBm63Q6ZdCgIaLXG8TfP1RCQ8Pk6NGjGdY9d+5cqVcvSho2bCVLly7N0M5s\nNsubb46QmjWbSMeOPeTUqVOqduvWrZOA0JpCzf3CgyuEGvvFR2+QlJQUN9sjR46IX0CwUKiaEFxN\nfAxFZP78Bar1RrXpIPiHCcEPCr5FRRsQJosXL1a1TUpKkv79X5
EaNZpJr17PytWrVzM8r7xk1apV\nomhDhMABQrE5gm8LUTQBkpqa6lG9MTExUqp0pAQVeVD8AorJU0+/KE6n06M6r1y5IqElSom2zUih\n93diKFVN3h//kaptUlKS1KrVRAICyonBUEqiojqKyWTyqP3scPjwYQkIKCLwosCbYjCEyaJFi3Kt\n/dwAL+RTYKtkvXjY3t1Q8rwD/+VLyojjx49LQEBxgTHpg0K0BAeXlz/++CPDz8TGxsqBAwfEaDRm\naDNv3jzR6YIFigoUEx+fIPntt99UbXv2fEr8/aMEpotG84aEhISr3mwXLlwoSnATobG4SiOHKLpA\nuXLliptt9yd6CZpggY8FvhG0ZaVEqfKq7V+9elVq1W8iOv8A0el9ZdToaFU7h8Mhdeo0E1/f5wU2\niI/PG1KxYq0Mk+fkJWPGjBH0tYSyTqGcCGVSBPRy+vRpj+s2m81y6NAhiYmJ8byjIvLll1+KX8On\nhcniKiNOSHBoWIb2drtdjh49KsePH/d4QMouL730mijKcwIb0suHUqNGo1ztQ07jlUFhk2S95INB\nIV95H1WoUIHixUPQajcBCWg02/H3t2aq0hkREUGtWrXwz0ReetSo97HbA4FXgRex2fSMHDnGzc5m\ns7F48UJMpo+BBjidz2K1Vmf1andJbofDgaQehivfgukUxAxBRFHd6D165AQ4XwAeBaLAMZaEqzdV\n+1qsWDH27djC8cMHib8cx/j33fsJrjiFEydisVimAS2x2T4lPl4yjFPIKna7nZcHvYFfQBCG4MKM\nfHfMrf9c/5nIyEhQ9HBrCUbRo2i0XokS9/X1pUaNGpQtW9bjusB1/k7dbb8lHwMOR8aL0lqtlqpV\nq1K5cmWP90iyi9VqQ+T2wELfOwYn3peYs1HyAflqUNDpdGzZso4WLfwoWnQxjRpZ2Lp1I4GBgR7V\ne/16EtATV1a9ikB3Ll1yz3vwt+fQ3xm57PZU1Y3u0NBQDH5l4fz3cKglXItBpzhUbatVqQyYbjti\nJrRIiJsdwLVr16jXpAVVa9ahxAOleHfM+6p2Op0OESt/76I5ETGrtp8dxk+YyJz1+7GMO41p9CG+\nmL+cb6ZN96jOzp07UzQ4HpJGg2kDmuu9aB7V4q7MJ9y1a1f0x36GP76Gk+swLOzLs/3753W3VHn+\n+acxGBbhyo67A4PhS1577bm87tbdhwcuqYqiVFYUZf9tJVlRlMGKokQrinLxtuMdb/vMSEVRTimK\nclxRlHY5f4L/Iq+nKv9lOpfbFCsWIdBfYFZ66SEVKlRVtQ0ODhOoLDBOoLcoikF1+cpkMkmxYqUF\n/ARKCfhJq1btVes8cuSI+PkVFhgiMFb0+mIyf/58Vdv2j/UQn1aDhSkO4YMrElDyQVmyZImbndPp\nlDZtHhN//0cF5oifXy+pV6+F2O32bFwZdx5q2koYstqVy3iGCC/MlY7denpUp4hrma9rtyelRq1m\nMvDVoZKWluZxnTnFwYMHpc0jXaVWwyiJfn+8x9c0J/n999+ladO2UrdulHz77fRcX8LKafDG8tFS\nyXrJpD1cD+GXgQhcAVJDVWyqAgdwuUSWxeUiqfHkHLJb8p33UU4QHBzAtWvzcOVfsAMbKVKkgZud\nxWIhJeUaUBOYDgTj51eFEydO0LRp03/YxsTEkJKSguu3URQ4xt69s7Hb7W5P69WqVWPnzk1MnPg5\nRuNVBgyYlaH30a5dO7G9OMmVSyA4jLRafdm2fSddunT5h52iKKxYsYgJEyaxa9cKataMZPTomarL\nV9mhRLGiKJePItVdDzjay0cIL1nUozoBSpUqxYsvPElsbGyeidxNnTqVrVu3UrduXYYOzdiVt0qV\nKrz8XD8SExOJiory+Jre4uDBg+zc6cqP0alTJ6+kOX344Yd5+OGHvdC7fIz3VtTa4EqleSE917La\nemEXYL64krucUxTlNNAA2O
G1XtyBgkEhC1gsNlw3eguu77EyZrN7jmS9Xo9eb8Bi2YrrYSAes9lC\nyZIl3WyPHz+OXl8Bs/nWDbMqNpuTq1evqiqV1qxZkx9+mHHHvpYsGUHSmc0QWhacDvxjt1K2cxdV\nW19fX8aMUc+g9l+Z9OFY/ohqhTXuIIrdQsC5rURP3epRnSJC377Psnz5NpzOqijKGD75ZAwDB77i\npV7fmRYtOrB5836gPfPmTWbOnJ/Zv989KM9qtdKiQ1vOWJLRR0aQPGoEP86Zqyp1kh1++GEuL730\nJtARjeYAUVGzWb58Ya7mv75v8Z5Lam8gPYgIAQYpivI0Lk2aN8WVlvMB/jkAXATcbyA5SL77Rdls\nNiZN+pRevZ5i3Ljxmap+ZpXChUOAh4BHgI5ALUJD1RPs2O0OoDDQAaiOiKIqXRAZGYnNdhZISj9y\nEq02PdJZhVWrVlG1ah3Kl6/OJ598kmFfv//2K/yXDUY3thK6MRFUCUhVDVzLKapUqcLR/XuY3KsB\nU55uybEDez0OyNu+fTvLl28kLW06JtNIjMapDBkyzCvfbVY4dOgQmzdvBg4Bs4EjHDhwmLVr17rZ\nzp8/n7OKiYgtX1Ji1gjCF7zH84MGetS+iPD8869gNL6E0QipqW3ZtOlYrus03bc4MimHN8Ki6L9L\nBiiKosflKXIr29T/AeWA2riWlCZn0oNclfDNVzMFEaF7916sX38MozGS5csX8Ouvv7N58+8eTeFb\ntmzK6dN/YDZXBJz4+x+gdevebnZmsxmHw4brey+N67vsxLp16+jWrds/bKtVq0Z09EjGjBmHr28Y\ndvs1Fi9e4Jb3AWD16tV06tQd12Z3EG+99T6JiYl89NFHbrb79x8EZzB2+xA0EsvZk7O5evUqpUrl\nXmBeqVKlGDjQsxvh7cTHx6PVluVWjmwoiUbjy40bNyhRooTX2smIU6dOAcWAW9IeIaBEcOrUKbds\nefHx8ehqV0BJf4I3PBTJhXh1RdmsYjabsVhtoKwA/Uvg2IXRdJ3z5897VG8BWSSz5aPIlq5yi58z\n1LfpCOwVkWsAIvKXp4qiKP8DboWyX8K1zHCLUunHco/c3MDwRiGTjeZz586Jv38hgXfS4xRGS2Bg\nuOzZsyfDz2SF1NRUqVixioC/gL/Url1PrFarm53FYhHQCpwVuJBe2sv48eMzrPv8+fOybds2uX79\neoY21avXFegrsDa9fCC+vkVVbUuWelAI3SaEixAuogt+RT74YFz2T/ouIjY2VgyGIgJfC2wVRRki\npUtHisPh8Khem80mw4aNlDJlqkmNGo1l7dq1qnYJCQmiKAECPwg4BH4RMMjZs2fdbLdt2yZBDxSX\nqkfnyEPWjfLA60/Iw507etxP0AkBV4UgEQKdgrahfPbZZx7Vez+ANzaap0nWSwbtAQuAZ257H37b\n6yHAPPnnRrMe10ziDOnJ0HKr5KvlI4vFgqL48PcESItGo8dqtWb2sTsyd+484uKSgWeAfpw8eZ7l\ny901avR6PfXrNwNGAXHAr/j67qRPnz4Z1l26dGkaN26sqqR6C4vFCtzuVhuA0+lUtbVaLaAU+uu9\nQwqnf94zlixZQomIChiCQ+jcrRfJyVlPUegpERERLFmygCJFxqEozalUaQPr1q30eD192LCRTJ26\njvPnozl8uA9duvRm3759bnZFihRhwYKZaLWvAjo0mqeZPv1zypUr52bbuHFjPhs/gfNNX+OAoQ3l\n/kxgwSx1nais4nA40Gg1oAS5DigKfn5hFC3q+QZ+AVnAQ5VURVECcG0y/3zb4Y8VRTmkKMpBoAWu\ngQEROQYsAo4BvwID0we33CM3RyBvFDKZKdjtdqlQ4UGBugLPCzSRsLBSmUYrZ4UGDaIEnkmPKP5Y\noKc88kg3VduEhATp1OkJCQ4Ok4oVa8uWLVs8alvEFSULBoH3BCYJlJTOndXbf/vt0WIIbiSE
bhEK\nzxVDQFE5ePCgR+3v27dPDIWLC/02C4Ovim+d/vJIlycytHc6nRITEyPnz5/3uoujzWbzWl2hoREC\nqwSOCRwTRXlR3n13dKafyaoMhdPp9GpfO3fuKX6BPQTDdlH8P5fChcNVo98L+Cd4Y6bwuWS95IOI\n5ny1p2C1Wrl+/TouL6GVgB8pKSmkpaVlGrF8JwIDA4DbvY1SCQpSD5wqUqQIK1Ys+s9tqfHaa6+R\nmJjIhAlf43A46dAhil9++VHVdvz4Mfj7+7Ng4TCCggOZNHExNWvW9Kj9devWYavcByKaA2BpMZl1\n35ZVtU1JSaFt264cOpSeozmqKcuWeS9Hs6fBdbfj5+cPJOJyBwedLomAAPen/39+xi/Tv99CURSv\n9nXRou8YOnQU69YPolTJcKZOXU9YmLp8eQFe5j4L8lYkl2cmnqIoimTU52PHjtGoURtSUl7661ih\nQj+wbNmMDJPXZ4WtW7fSrl0njMaGgIOAgL1s27bJ45vtvcKMGTN4fdJijF1XuqQmLu2g6NpeXItz\n3+h86aXBfP99EhbLLMCBv38Phg9vyNix3lNN9RazZ8/hlVfexmjsh1Z7hZCQDRw+vDtXNq9vkZiY\nyNq1a9FqtbRv356goKBca/t+QFEUROQ/64coiiJMyMY9coRn7d0N5KuZQvHixbFabwLJuJLnGLFa\nrxMeHq5qbzQa+eGHH0hMTKR169Y0aOAekAbQtGlTFi78gQkTJqLVahk79pdcHxCuX7/O/PnzMZvN\nPPbYY1SuXDnX2u7Tpw+ffjmNc0sfxVroQXyO/8BX33yhart79yEslncALaDFZHqSHTsW51pfs8PT\nTz9FeHgJFi9eTkhIBIMG7czVASEmJoYGDZpjNlcCbISEjGLv3q0ZuiUXkEfcZ1lS89VMAeCTTyYT\nHf0RGk1ZnM5YXnvteT7++EM3O5PJRL16TTh3zoHVGopef4gZM76md293V9PDhw/TtGkrLJbaKIoD\nf/9j7Nu3Q3WjMSe4cuUKteo35uYDzXD4haA/NJ+1q5bRuHHjXGkfXAPo3LlzSUhIyHQAffLJ5/nx\nxyBstk8Bwde3P6++WpLJk93dZ+93unTpxYoVRXE6XXEkPj6TeP75Ykyd+nke9yz/4JWZwuhs3CM/\nuPdnCvluUDCZTLz88kB27dpDtWpVmTlzOsHB7jmCZ86cyaBBn2I0PoUrSjmWIkUWk5Bw2c22Y8eu\nrF4diEgbADSapTz5ZBFmz3aPMHY4HEya9BmrV28hIqIEH330nmqEcnYY/tZIPtuZhr1L+tP53h+o\nFzuL3VvWeVRvTnD16lUaN27DtWs6RGyULx/EH3+sLlgWUaFOneYcONAXaJh+5Fc6dtzPqlU/5WW3\n8hVeGRRGZuMe+dG9Pyjkq+UjEaF9+0fZvTsBs7kOMTEnadasNXv3bncLCktKSsJuL8Lf8iNFSU1V\nd7O8fj0RERMwHtDgdEZwLYN4pJdffoN58w5iNA5Gp9vN6tXN+PPPvYSEuKuaxsTE8PLLQzl79jxN\nm9bnyy8nqd48ryYmYQ+t/veBYpEkHkxys7sbKF68OEeP7mLPnj1otVrq1aunGpBXALRp04wTJxZg\nMtUE7BgMP9O27dN53a0C/s19lnktX8UpxMTEsGfPfszmnkB1LJauxMTEs3//fjdblwjYPmAiMA5F\n+YaoqFaq9ZYp8wCwG+iKS+piJxUrukcI2+12Zs36FqMxBBiI3b6KmzfD+fXXX91sk5OTeeihJqxZ\ns5vTp2OYPftXHn64E2qzoG6dO2DYNgXiDsGNi/ivfYeuj3impZOT+Pn50axZMxo3buy1AWHv3r1U\nrVqfQoXCaN26M1euXPFKvXnJuHHRdOxYCq22NVptO558sjGDBw/K624V8G8yk7n4d8kH5LuZgkt8\n0IJL0bTYremjm61L9kLBJUpYHEVZmeEN7OLFeKAfUC39
SE9iYtyXmQAcDh2uYMTvgKOYTNHEx8e7\n2S1fvpwbN24Cg4DaiCxh9+7lXL9+3W2jsWvXrky4GMeYcY9gs1jo3bs3E8ZnGE6f74iPjycqqh1G\n49tAAzZs+I6oqA6cOLE/1xPTeBNfX18WL56LxTITRVEyzM9dQB5zn7mk5qtBoVy5chQtGkxs7Hhc\ncuRWfHxCqVOnjputS8ysDuDy4nE6u7N+vbrQnMs3/fYkNyb8/d391Z1OJ4piQeQlXOKGkUB91cHm\n7NmzQHlcgY4AA4Cl3LhxQ9X7ZNBrAxn0mvf0hP4LJ0+eJDExkWrVquXqHsHy5csxGsvi0hMDeItT\np+sRHx+fq95COYW3YjgKyCEKBoV7l+vXrxMbexHoA1QALpKY+D0nT56kevXq/7ANCgrC6byKS7RO\nAZLw9VUPcBs69BU2b+6NSApgQ6NZxeDB7gqZOp0OrVaH3d4TCAcuo9MZVKWzS5cuDVzHNefUAjcA\ne4YBSRaLhQ0bNmA2m2nevHmmshjeRkR4/sVBzF+4GB9DSXSOK2xYtyrX3HLj4uLQaK/hdOzDNQMM\nA7Gpqs8WkH9JTk5m06ZNaLVaWrVqlXs5Ne6zPYU8GxQURTkH3MR1V7SJSANFUYoAC4EywDmgp7g0\nxrPE9u3bAX9cAwK4BAZDWbdundugUKhQIRyOWOAHoDiwCz+/Qqjx88/L0GrLYbefARR0utIsXbqC\nZs2a/cNOo9Hg5xdEaurbQHPgBk5nP9VcwkFBQWi1GhyOUbg0sP5ApzOQmprq5i2VmppKo0atiY11\nAIXw8XmV7ds3uHIX5wLLli1j4S+bMVU/iUkXBPHf8UTvZzlxbG+utN+qVSui358EypugeQgcW1C0\nisdeXQXcO8TGxlK/WQtMRSuC1Uxx50h2/bExd1KyWnK+ibuJvNxoFqCliNQRkVtO7yOAtSISCaxL\nf59lXE+uRlxP4OAacxJVl4/Onz+PVlsL18DhALqQlHTdzQ7g8OFj2O11gc5AJ6zWmhw4cMTNzmKx\nYDTeAG4NFoXx82uUvlT0T8qXL49eL0APXJLMz+Pjo6iKnE2a9CmnT4eTkvITKSkzSUoawIsvDsn0\nWniTEydOYAlsC7r0JaPQbsScPZ5r7fv6+qLRBEDACTCsAsN2tBp9QYKZ+4jBb40ioUF/UoatJWXk\nZi6UbEL0+FyKffFQEO9eI6//V/17l/Ax4Pv019/jcvfJMgaDIX3j8VtgJvA1Go1GdY2+WrVq+PrG\nArWAVihKMhUqqEcJ16lTC73+T8AJOPDzO0m9erXd7Hx9fQkLiwDWpx9JAPZRtWpVlTrrMGjQAAyG\nryhU6AAGw2fMmTNDdbPx9OlYLJb63LpcIvWJjb2Q4XVYuHAhXbr05ZlnXkrPBaCOxWIhOnocnTr1\nZuTI90hLc88mB+nXKuVXsLncYJXr86lUqbqqbU5w8eJFAgIb/K3+qq2OTudPYmJirvWhgLzl7PlY\nHJVd2lsoCtbI5pw5l/H/Aa9iy0bJD+SVEh9wFtiPKxXdC+nHkm77u3L7+9tVCzPi6NGjEhgYLvC0\nQJRAHylUqKJs3rzZzdbpdMrrrw8VP78gCQoqJcWKPSBHjx5VrffGjRtSp04DCQgoLgZDqDRr1jpD\n5dXdu3dLSEgJCQqqJL6+wRIdnXkug0OHDsmKFSvk3LlzGdpMnz5dAgJqCxwQOCW+vo9Lv37Pq9p+\n9dVUMRjKCXwpGs0ICQ4OU63b6XRKmzaPib//owI/iJ9fL6lXr4Vqknmn0ymDBg8XX0MRCSpaVYqX\nKCvHjh3L9Ly8yZkzZ8TfUFQw7HPlE/CbLcXDynqcT6GAe4dBQ4eLX6Puwndm4X83xVC9hUyc9Okd\nP4c3VFK7SdZLPlBJzbOIZkVRwkXksqIoxYC1uHwzl4lIyG02iSJS5F+fkzFjxvz1vmXLlrRs2RJw\nPfmWLl2Ba9fqIFIT
OEmhQhs5d+6U6ro+uJ5CExMTiYyMzFQB0+FwcPLkSTQaDZGRkZm6QqalpXHq\n1CnCwsIy1F3KDiJC27YdWbduKyCUKvUAhw7tVA2IK1myMnFxX+NKHwpa7QjGjIlg9OjR/7A7e/Ys\n1as3xWQ6D4oexEFgYFU2bpxH3bp1VfuR1WuVVZKTkxnw/CA2b/mDEiXC+d+3UzKUz/jpp8U8/fQA\nnE6FwiEhrFm95L4RJCzApVTQvXc/fl/zGyD07vMk303/xi2j4saNG9m4ceNf78eOHet5RPOj2bhH\nLnePaM7u/qmiKCOB59LtXxeR3M27mtejUvqgNAZ4EzgOlEg/Fg4cVxu5M+PYsWNSuXIN0Wp1UrZs\nJY+zrmWXuLg4adiwpWi1PlK0aElZuXJlhrabNm2S8PByotX6SM2aDVQzeYmI/Pbbb2IwlBCYIfCT\nGAy1ZcyYD1Rtw8LKC2wRSBBIEI1mkIwe/Z6b3YkTJ8RgiHBlElNEwClBQTVkx44d/+3E/wMtW3US\nfZFnhVInhWJzJTComMTGxmZob7PZ5OrVq17P0VDAvUNycrKkpKRk2R5vzBQ6StaLSntADFDkX8cm\nAm+lv34bmJD++lbmNR9cmu6nAY0n55Dtc87Nxm67IAYgKP11ALAVaJd+od5OPz7i1oX695d0N1On\nThPRansJ/CzwsRgMIXLixAk3u0uXLklgYJH0xDk/ikbTX8qVq6K6JPLssy8JvCVwML3MlooVa6u2\n/95774vBUEvgJ4HPJCCgqOqymMPhkDp1momv7/MC68VH/4ZUrFhLzGaz5xchC5jNZtFq9UJZi1BO\nhHIigcV7y/fff58r7ec0sbGxcuDAAY8TPBXgGV4ZFNpI1kvGg0Lov44dB8LSX5e49QAMjLx1D0x/\n/xvQyJNzyG7Jq43mMGCLoigHgJ3ACnFNkSYAbRVFOQm0Tn9/V5CWlsbatWtZv349ZrNZ1cZisXDw\n4G4cjn64kszXQFHqsXXrVjfbXbt2odFUBuoBvjid3bh8+QpXr151sw0ODkSrvV1s6Vp64h93xox5\nh7fffpzy5aOpXXsha9cuU93o1mg0rF+/nD59fKhR4z16dL/Jtm1rcy2QysfHB41WC4708xUB+yUC\nAwMz/2AWsNlsbNq0idWrV3Pz5k2P68sOIsJrr71JZGRtmjfvRdmyVTh+PPc8tQrIASzZKOoI8Lui\nKHsURXkh/ViYiNySOojHdU8EeABX5OstLgLugU45SJ7EKYhIDODmviMiifwd4nvXcOXKFRo0aEZy\nsg8idsLD/di5c4vbPoVer0ev98NsvgSUBhwoykVVN9NixYrhcFzC5bLgA1zH4TCrKroOHfo633/f\nkJQUCw5HMAbDj3zyyQLVvh49epQpU77G6Yzk6tUrjBo1jjVrlqhGVRcuXJhZs6Zm/4Lcgfj4eFat\nWoVWq+XRRx9V3fvQaDSMGRPNhx+3xqh9Dj9lN2VLWXnkkUc8attoNNKsWVtOnUpCownCz+8KO3Zs\nzDWZ85UrV/LddysxmzdjNhciNXUOjz/+DEeO7MyV9gvIATJzNU3eCDc33qmGpnLb/qmiKP94ShAR\nURQls42LghzNd5rOeYtLly5Jt27dpHnz5vLFF19kaNer11Oi07UXmCrwtej1UTJo0BBV25kzZ4nB\nUEx8fR+TgIAa0qJF+ww9erp27SUBAZHi6/uoGAwlZFIm3hSxsbHy3ntjZPjwEbJ79+4M7WrVaiow\nWeCKwEUxGKJk2rRpmVwF73Ly5EkJCX9AAro8IQEdH5OwsuUkLi4uQ/slS5bIG28Mk8mTJ0taWprH\n7Y8d+4H4+bUX2C9wUDSawdK2bReP680qEyZMEK32WYFogcEC34uvb2CutV/AP8Eby0f1JOvlDu1x\nh/1TXMvmI26z/w1o6Mk5ZLfkK5mL7HD16lXKlo3EZisNhLNly9scPHiI//1vupvtqV
NnsdsfSn+n\nYLVW5Pjx06r1Pvtsf6pXr8a2bdsoUaIEPXr0cPOQAJfO++LF81i6dCmxsbHUrfuWW4T07URERDB2\nbPQdzys29hyuaGoAHUZjY06fdg+eyyneHP0eyc+9inPgmwBYPhjJmPHj+farr1Ttu3TpQpcuXbzW\n/okTMZjN9bgVguN0NuTMmfWZf8iLlC5dGqfzI6AxLm2roYSEuM8UC7iH8ED9VFEUA6AVkRRFUQJw\n7Z2OBZYBzwAfp/+7JP0jy4B5iqJ8imvZqBKw67/3IPvkdfBanjF8+HBstnBc30d74AVmzpytauuS\nzt6I69dhBTZTvnxEhnXXr1+fwYMH06tXr0yTt2s0Grp168bgwYMzHRCyw0MP1UWn+x7XjDORgIBl\nNGhQz+N6jx8/TpcuPWnc+GEmTpyE0+lUtbt0JR5n9Vp/vbfXqM1FlX2SnKJp03oYDKuANMCBXv8z\nDRuqn7+I8NVXU2nSpD2dOvXkwIEDHrd/8+ZNNJrawBfAG8A0TCb1PagC7hE8i2jO1v6piBwDFgHH\ngF+Bgekznlzjvp0p3LhxAyjK30HVIYioPxL8+ecJ4BQwHFdUs8LRo8dyo5vZ5ocfpvHww49y9mwN\nHA4TL774Gj169PCozgsXLtCwYRQpKZ0ReZBDh2Zy5cpVPv10optth5ZR/Pl/n2KqVQ9sVgwzvqLD\ns7mXOObll19i5879zJ/fFq1WT82aNZk61X32BzB+/Md89NEcjMY3gTg2bWrDvn3bPNKUSklJQaMp\nh+Ovn1IprNYC4b57Gg/kK+Q/7J+KyIeAew7h3CI316q8UfDSnsLPP/8soBd4TmCEQHUpXjxC1TYw\nsJjA4wLDBIYLdJawsDJe6UdO4HA45NKlS3Ljxg2v1Pf555+Lr297gWXpZYYEBBRWtbVarfL0iy+K\nztdXdH5+8vqwYXkSeZyQkCCXL1/ONKahePFyAmsELghcEI3mRYmOjvao3YMHD4rBUDTaxTwBAAAI\nDElEQVQ9pmSD+Pp2ku7d+3pUZwH/Hbyxp1BRsl7yQUTzfTtTKFeuHD4+CjbbfFzLQnqqVm2sahse\nHoZLQuiWu6RQrlyZ3Onof0Cj0XhVQVRRFBTl9uUiB4qivvLo4+PD99OmMXPqVBRFyTPRuqyoZ7r6\n9vfsUFEcHiftqVmzJosXz+Hll98kOTmJ9u3bMWPG1x7VWUAec5+ppN63g8K6detQlLq49n0ATGzb\n9oWq7ZQpE+jevQ9Wqw3XOvUmPvnEPcVmfuXxxx/nvffGY7UuwOkshcHwC4MGvZrpZ9Q21+823nzz\nVcaMGYzR+DqKEofBsJR+/Tx3He3QoQPnzt296VILyCb5RP00q9y3g0JISAg6XRJW660kOwkEBann\nU+jUqRNLly7iiy++QafzZdiwlV7bGL4XCA8PZ+/ebYwePY74+GN06/YGAwe+ktfd8pg333yD0NAQ\n5s9fRkhIMNHRmylfvnxed6uAu438on6aRfJMEO+/oiiKeKPPJpOJ+vWbEhNjwWotgl5/lBkzptK7\nd28v9PLeQERYvnw5R44cITIykh49etzTOY8LcOFwOJg/fz6xsbE0aNCANm3uunjQXCM9R7tngngh\n2bjfJHnW3t3AfTsogGtgmDt3LgkJCbRu3Zr69et7pd57hUGDhjJr1mLM5ur4+Z2ga9fmzJkzs2Bg\nuIdxOp107NSDrfuuYfZtim/KT4wa/hLvjHorr7uWJ3hlUAjKxv0mpWBQyHW8OSjcz8TFxVG+fBUs\nlg9xaRJaMBhGs2vXBqpVq5bX3SvgP7Jx40Yeffw1UivuB40PWC7hc6QSN5MTvSJ3fq/hlUHBPxv3\nG9O9Pyjct3sK9ztJSUno9YWwWG4J6/ni4xNKUlJSnvarAM9ISkpC41/ONSAA6B9Ao/UlJSXlvhwU\nvMJ9tqdw30Y05zQ2m427eUZTsWJFAgM1KMpaIB
XYjFZ7oyBxzT1Oo0aNcN7cCQk/gy0BbdwYypWr\noCrKWEAWKcjRXIAnXLjw/+3df2jUdRzH8edb7c7N8kdWbtog2TRm0xm4JbU/oqAcCAVhofSXCEZk\nIoXQ/gj/7Y8iKjJ0Kyw1KwfKii3zj4nCRszUiRuRs4E/JkJyoShjbp/++F63c67rbvfdfe++ez3g\nYPfd97Pv57Nx997n+/7c532R6urVRKNFzJnzMM3NzUF3aVzRaJRjx36hurqPoqIGKitP0d5+ZNxd\nWqVwlJaWcqTtMItHdlDUU05tWSdHjxxWnkjSppyCz5Yvr6G39wmGh9cD5ykufp+urhNUVlYG3TWR\nKceXnEJGO1cXfk5BMwUfDQ4O0tNzmuHhDcB04HHMauno6Ai6ayIiaVFQ8FEkEmHmzFl41fcA7mB2\ngQULFqRqJiJ5bSiDR+HT6iMfmRm7d+9k06YtmD3FtGkXqKtbRn19fdBdE5EJC0kGOU3KKUyC7u5u\nOjs7KSkpYe3atYFtCicy1fmTU/g7gxZz7rqemZUBXwOP4CUndjnnPjGzHcAm4N/i6w3OudZ4m/eA\njXi7Nb7tvPoLOaOgICKh5U9QuJpBi5KxQaEEr+zmaTO7HzgJvAy8Ctxwzn005nrLgP1ADV7ltaPA\nUufc+FWtJoFuH4mIpDTxXIFz7irxqOKcu2lmvXhv9jBa4SvZS8C3zrkhoN/MzgO1QOeEO5Eh3dcQ\nEUnJn0+vmdljwJOMvsFvMbMzZtZkZnPjxxYCl5KaXWI0iOSEgoKISErZrz6K3zo6CGx1zt0EdgKL\n8Up1DgAfpuiAajSLiOSPVDOAX+OP/2Zm9wHNwF7n3CEA59y1pO83Ai3xp5eBsqTmj8aP5YwSzSIS\nWv4kms9k0KJ6bKLZgD3AX865bUnHS51zA/GvtwE1zrkNSYnmWkYTzRW5fNPTTEFEJKXb2TR+Bngd\n6DazU/FjDcB6M1uJd2voT2AzgHOux8y+B3rwpihv5vq/YM0URCS0/JkpnMigRV3B732kmYKISErh\n2L4iXQoKIiIpTa1tLhQURERS0kxBREQSNFMQEZEEzRRERCQhqyWpBSd0QcE5R0tLC2fPnmXJkiWs\nW7dO9WlFJAuaKRS0rdu389WPPzL4whqiPxzkUFsb+5qaFBhEZIKUUwiUma0BPsYrctzonPsg3bYD\nAwPsamxk5ORpbO48Bm/d4vDqVZw7d46qqqpJ67OIhNnUmink1S6pZjYd+AxYAyzD+yh4ZbrtY7EY\nkfnzsbnzvJ9XXExk4SJisdi453d1dfFi3dPUVC6l4d13GBoK5o/f3t4eyHUnm8ZVOMI4Jv/4s3V2\nociroIC3CdR551x/vMjEAbyiE2kpLy/nAQz3xee469cZ2beXaVcus2LFinvO7evro/65Z3mtv4NP\nh/6ga89Otr6x2beBZCKsL0iNq3CEcUz+yX7r7EKSb0FhEXAx6XlGBSYikQjH2lqpav2JGatWUrHn\nS9pbW5k9e/Y957a0tPDKrDtsfBBWF8M3D91m/4ED2Y9gAvr7+wO57mTTuApHGMfkn6k1U8i3nELW\nO91VVFTw2/Hj/3teNBolNjIaE2MjEJkRzK8jrC9IjatwhHFM/tGS1CCNLTBRxt2l6QB8XUn03bXk\nZzcCW6UU1tVRGlfhCOOY/LEj6A7kVF5tnW1mM4DfgeeBK3gljdY753oD7ZiIyBSRVzMF59wdM3sL\n+BlvSWqTAoKISO7k1UxBRESClW+rj0REJEAKCiIikqCgICIiCQoKIiKSoKAgIiIJCgoiIpKgoCAi\nIgkKCiIikvAPwlK5mqXLiQAAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x169e9ef0>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# scatter plot of Years versus Hits colored by Salary\n",
    "hitters.plot(kind='scatter', x='Years', y='Hits', c='Salary', colormap='jet', xlim=(0, 25), ylim=(0, 250))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index([u'AtBat', u'Hits', u'HmRun', u'Runs', u'RBI', u'Walks', u'Years',\n",
       "       u'League', u'Division', u'PutOuts', u'Assists', u'Errors',\n",
       "       u'NewLeague'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# define features: exclude career statistics (which start with \"C\") and the response (Salary)\n",
    "feature_cols = hitters.columns[~hitters.columns.str.startswith('C')].drop('Salary')\n",
    "feature_cols"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# define X and y\n",
    "X = hitters[feature_cols]\n",
    "y = hitters.Salary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Predicting salary with a decision tree\n",
    "\n",
    "Find the best **max_depth** for a decision tree using cross-validation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# list of values to try for max_depth\n",
    "max_depth_range = range(1, 21)\n",
    "\n",
    "# list to store the average RMSE for each value of max_depth\n",
    "RMSE_scores = []\n",
    "\n",
    "# use 10-fold cross-validation with each value of max_depth\n",
    "# (cross_val_score lives in sklearn.model_selection; sklearn.cross_validation was removed in scikit-learn 0.20)\n",
    "from sklearn.model_selection import cross_val_score\n",
    "for depth in max_depth_range:\n",
    "    treereg = DecisionTreeRegressor(max_depth=depth, random_state=1)\n",
    "    # the scorer returns negated MSE (higher is better), so flip the sign before taking the square root\n",
    "    MSE_scores = cross_val_score(treereg, X, y, cv=10, scoring='neg_mean_squared_error')\n",
    "    RMSE_scores.append(np.mean(np.sqrt(-MSE_scores)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0x18376a20>"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEQCAYAAABbfbiFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmUFPXV//H3BUWiokjcEHcBI7iwuD0qOIoSJCru/qIx\nLkk0MVFjNO4+4KMRTDTR4xJ3owZUcI0rCDqIGyqCigiCghEQcQEElXXu749vjTSz9PT0VHV193xe\n53Coru6uuswZ+vZ3u19zd0RERDK1SDsAEREpPkoOIiJSi5KDiIjUouQgIiK1KDmIiEgtSg4iIlJL\n4snBzFqa2UQze7LG+fPMrMrM2mWcu9jMppvZVDPrm3RsIiJSt7UKcI9zgClAm+oTZrYVcDDwSca5\nLsDxQBegAzDazDq7e1UBYhQRkQyJthzMbEugP3AnYBlP/R24oMbLBwAPuPsKd58FzAD2TDI+ERGp\nW9LdSv8A/gz88O3fzAYAs9393Rqv3QKYnfF4NqEFISIiBZZYcjCzQ4H57j6RqNVgZusClwADM1+a\n5TKq7SEikoIkxxz2AQ43s/5Aa2AD4D5gW+AdMwPYEphgZnsBc4CtMt6/ZXRuDWamhCEikgd3z/Zl\nvNaLE/8D7A88Wcf5mUC76LgLMAloBWwHfARYHe9xic/AgQPTDqGs6OcZH/0s4xV9dub8uV2I2Uo/\n5KFs59x9ipkNJ8xsWgmcGf2DRESkwAqSHNx9LDC2jvPb13h8NXB1IWISEZH6aYV0M1dRUZF2CGVF\nP8/46GeZLiu1nhszU2+TiEgjmVmjBqTVchARkVqUHEREpBYlBxERqUXJQUREalFyEBGRWpQcRCR2\nX3+ddgTSVEoOIhKbBQvgtNNg883hyy/TjkaaQslBRGLx5JOwyy6w7rrQuzeMHp12RNIUSg4i0iRf\nfQUnngjnngtDh8JNN8GRR8Jzz6UdmTSFkoOI5O2RR0JrYbPN4N13Yf/9w/mf/hRGjgQVMyhdhazK\nKiJlYv58+P3v4b334OGHYZ991ny+Y0dYb72QMHbbLZ0YpWnUchCRnLnDsGGhtbDDDjBpUu3EUK26\n9SClSclBRHLy2WdwxBFw9dXw1FMwZAi0bl3/6/v107hDKVNyEJGs3OFf/wrdQ7vtBhMmwB57NPy+\nAw6AN9+EJUsSD1ESoDEHEanXp5/C6afDvHkwahR065b7e9dfPySRF1+Eww5LLkZJhloOIlKLO9x+\nO/ToAfvuC2+80bjEUE3jDqVLLQcRWcPMmfDrX8PixeFb/84753+tfv3g6KPji00KRy0HEfnB3Lmw\n337Qty+8+mrTEgPArrvCt9/CjBnxxCeFo+QgUiTeeSesLP7uu3Tuv2xZ+JZ/5plw4YWwVgz9Cmbq\nWipVSg4iReK66+Ctt+Dkk6GqqvD3P/ts2GILuOSSeK+r5FCalBxEisAXX4TCdePHh/UEl19e2Pvf\nfjuMGxemrFrOW9Dn5uCDobISli+P97qSLCUHkSJw111w1FHhm/tjj8GDD4YP6kJ47TW47DJ4/HFo\n0yb+62+8MfzkJ/DKK/FfW5Kj5CCSslWr4J//DLWKADbZJKxAvvBCGDs22Xt/9hkceyzcfTd07pzc\nfbRauvQoOYik7KmnQouhR4/V53baKZS/Pu44mD49mfsuXw7HHBMWuR16aDL3qJbGuMOjj8LAgdqV\nLl9KDiIpu/lm+MMfap8/6CC46ir42c+S+YA755zQSrnssvivXdNee8Enn4SWSiG4h4H1iROhU6fw\nb/zqq8Lcu1woOYikaNq0UNb6mGPqfv43v4HDDw/jEXEO6N55Z1jgdt990KIAnwJrrQV9+oQSHIXw\n8svh3/XEE2EG2Oefh26zSy9VksiVkoNIim65BX71K1hnnfpfc8010LYt/Pa38WyeM358+Fb9+OOw\nwQZNv16uCjnucMcdYZW3GWy3XXg8YU
KYFda5c/j3a4/r7MxLbKsmM/NSi1mkLkuWwDbbhD0Rttoq\n+2u//RZ69QpjEBddlP89580LxfBuugkGDMj/Ovn49FPo3j18i2/ZMrn7LFgQEsKMGWGmVE2zZsHg\nwWGTotNPh/POq/t15cbMcPecJyqr5SCSkn//O2yr2VBigLCr2pNPhvGJhx/O737Ll4eZSaedVvjE\nAOHfudlm4Rt8koYODa2U+j7wt90WbrsN3n47JJIddwwJ94svko2r1Cg5iKTAPXzQV09fzUWHDqEP\n/Xe/C/skNNaf/hS6pwYObPx745L0rCX31V1KDdlmG7j11jBovWhRWItx4YVKEtWUHERSMG4crFwJ\nBx7YuPf16BEGk484Av7739zfd8898PzzobVSiAHo+iQ97vDWW6GabGN+rltvHdaZTJwY3qskEWjM\nQSQFxx0HvXvXPYU1F9ddB/feG1YdN7Sq+c03oX//sKCuS5f87heXpUth003DtNaNNor/+qefHrqN\nmlIf6tNPwxaoDz4Ip54a9ssuB6ec0rgxByUHkQKbOzeUwp41K//ZQu5wxhkwZ07oaqqvgurnn4cB\n6OuvD9Nhi8Ehh4QZWvVN383XkiVhXGPKFGjfvunXmz0bbrwxDOKXg/vuU3IQKWqDBsH8+WEaa1Os\nWBE+aLt2hRtuqPv5gw4Ks5yuuqpp94rTDTfA5MlhbCBOd94ZVps//ni81y0XRTdbycxamtlEM3sy\nevw3M/vAzN4xs0fNbMOM115sZtPNbKqZ9U06NpFCW748VEBtzEB0fdZeO8xcGjUqDG7XdP75YZbT\nFVc0/V5x+ulPw7hD3N/x7rgjLBqUeBRiaOocYApQ/aswCujq7rsBHwIXA5hZF+B4oAvQD7jFzDRg\nLmXlscfC1MmuXeO5Xtu24dvyVVetOdB7333wzDMwbFiyawryseOOIaYPPojvmu++G7rr+vWL75rN\nXaIfvma2JdAfuBMwAHd/3t2rtzIZD2wZHQ8AHnD3Fe4+C5gB7JlkfCKF1tjpq7nYYYfQgvjlL0N3\nzYQJYWHXY4+F5FFsqneHi3PW0h13hPUbxZYIS1nS38z/AfwZqG9fq9OAZ6LjLYDZGc/NBjokF5qU\niyuvDLNxKivj76qI07vvwscfJ7MAbd994R//CNVVjzoqTM1s6v7PSerXL771Dt9/H1pIp50Wz/Uk\niGGX2LqZ2aHAfHefaGYVdTx/KbDc3YdluUyd/9UHDRr0w3FFRQUVFbUuL83E4MHw0ENw1lmhv3nz\nzUMFzr5949/RrKluvjnMMFp77WSuf+KJoWtlxYr4ZwLF7cADQ0vnu+9g3XWbdq1HHgkzsrbZJp7Y\nykVlZSWVlZX5X8DdE/kDXA18CswEPgO+Be6LnjsFeAVonfH6i4CLMh4/B+xVx3VdxN39xhvdd9jB\nfe7c8HjFCvehQ927dHHffXf3xx93X7Uq3RirLVjg3rat+2efpR1J8ejVy/3ZZ5t+nd693R9+uOnX\nKXfRZ2fOn+EFmcpqZvsD57v7YWbWD7gO2N/dv8x4TRdgGGGcoQMwGujoNQLUVFaBsADs8svhpZfC\noqdMVVVhOuNVV4Vd1i69FI4+Ot3+6BtugNdfhwceSC+GYvOXv4RVyNdfn/81pk0L9an++19o1Sq+\n2MpR0U1ljRiru4huBNYHno+muN4C4O5TgOGEmU3PAmcqC0hdHnkkFEobNap2YoBQHuKoo8LA7ODB\noS++a9cwg2flyoKHS1VV/Rv6NGdxjDvceSecfLISQxK0CE5KysiRoa/6uedC+edcuMMLL4SWxCef\nwMUXh2tk20MhTqNGwQUXhNo9xTYOkqaqqjBG9Oab+Y0XLF8eVkS//HLY7U2yK9aWg0iTjRsHv/hF\nmKKZa2KA8IHcp8/qnc8efRQ6dgylEb7/Prl4q1VPX1ViWFOLFmHiQL6thyeeCLWilBiSoeQgJWHC\nhD
BuMGwY7LNP/tfZbz949tmQIMaMge23h2uvDXV5kjBrVvhme8IJyVy/1DVlvYNWRCdLyUGK3pQp\n8LOfhbITBx8czzX32CMMWo8cGco8b799KGcdt1tvDX3i660X/7XLQd++octvxYrGvW/mzLBZT7EU\nEyxHGnOQovbxx6G09ZAhoUspKZMmwfHHwwEHhNkzrVs3/ZpLl4a9Al55RV0f2fToEWZz9eqV+3su\nvxy++abugoNSN405SNmYMydUFb300mQTA0C3bmFg9KuvQtfTzJlNv+bw4eGDT4khu8bOWlq5Mmxe\npC6lZCk5SFH64ovQhfTb34ZtMQthgw3CB/pJJ8Hee4c9m5vipps0fTUXjR13ePbZMEupmMuDlAN1\nK0nRWbQodO8cckhYKJWG114L3UwnnhhqN9W3mU593nwz7PY2Y4aKwTVk+fKwO9yHH4a/G3L44WGb\nVNVSahx1K0lJ+/bbMPi8777pblDzP/8TZkhNmBBaMI3dDezmm0OLR4mhYa1aQUVF2OO6IXPmhNlf\nxx+feFjNXs7Jwcxam1mBlg1Jc7RsWZh90rFjGGhMe13AJpuELoz994eePUOpjlx8+WWYCaVvtrnL\nddzhnntCi0yzv5JXb7dStNHOEcDPgX0IicSAVcBrwFDg8UL38ahbqTytXBn+05uFKquN7cZJ2siR\nYUrqn/4Ef/5z9sR1zTUwdWr4IJPczJwZxnk++ywsjqtLVdXqvSt69ixsfOUgzm6lSqAncC2wvbu3\nd/fNge2jc3sAY5sQqwgQ/tOfdloo3zxsWPElBgiDpm++GVZnH3EELFxY9+tWrQp7KcS9oU+52267\nsDHRO+/U/5oxY8JrevQoXFzNWbbkcLC7X+ru4919WfVJd1/m7q+7+yVATEuSpLlyD3sxzJwZVi0X\nqt5RPrbaCsaODR9kPXuGRVg1Pf00bLYZ7L574eMrdQ3NWqpeEZ12d2NzUW9ycPdlZraWmU3N9ppk\nwpLmYMUK+PWv4Y03wj7ITd30pRBatQqL5AYPDh9md9yx5u5zqr6av2zjDl98EQoYnnhiYWNqzhqc\nympmTwBnu/snhQkpO405lIdvvgm7lbVqBQ8+COuvn3ZEjTdtWqj31LNn6EqaPTus8v3kk3hWWDc3\n334bqrTOmRPWnGS69tqwP/a//pVKaGUhiams7YD3zewFM3sy+vOf/EOU5q76Q7RjxzCrpxQTA8CO\nO8L48aHlsNdeYXvS005TYsjXeuuFQekXX1zzvHvYt0Erogsrl5ZDRR2n3d1TGYxWy6G0vfMOHHpo\nGGdoaNZPqXAP3UsXXhhqNGkv4/xdey189FFoiVV76aWwUv7998vj9yUtjW055LRC2sy2JWzZOdrM\n1gXWcvdv8o6yCZQcSteoUaFG0o03lucipqqq+qdhSm4mT4bDDgsFF6sTwUknhRlK556bbmylLvZu\nJTM7HRgB3Bad2hJ4LL/wpLm6++6w+9qjj5ZnYgAlhjh07RomKkyfHh4vWBBqXJ10UrpxNUe5/Dr/\nHtgP+AbA3T8EcqiAIhK6XC6/PNRIGjs2VDwVqY9ZmAVWPWtp6NAwi2njjdONqznKJTksy5yyamZr\nAerXkQYtXx5WFY8aFQrZ7bhj2hFJKahe71A9lqOB6HTkkhzGmtmlwLpmdjChi6mJxYyl3C1cGL7x\nLV4cZp/kUm1TBMIeHuPGhQJ7S5aECr1SeLkkhwuBL4D3gDOAZ4DLkgxKStsnn4SqqrvsEurglMLi\nNike7dqFsYfTTw+LJDWWk45cfuxnufvt7n5M9OcO4OykA5PS9PbbITH85jehsqpKVks++vULg9Kn\nnJJ2JM1XLiXOTgFq7tR6ah3npJl75pkwxnDbbdr4XZrmhBNCi6F9+7Qjab6ylez+OXAC0AsYl/FU\nG2CVu/dJPrw649I6hyJ0661wxRWhaunee6cdjYjU1Nh1DtlaDq8C
nwEbE0p0V1/0G+DdvCOUslJV\nBZdcEtYvvPxyqLcvIqWv3uQQFdr7xMzG1SyVYWbXEAaqpRlbsADOOAPmzoVXX9VcdJFyksuAdF17\nNvSPOxApHe6hkmrXrmGK6ujRSgwi5abeloOZ/Q44E9jBzN7LeKoN8ErSgUlxmjULzjwTPv00dCVp\nfEGkPGUbkN4Q2AgYQuhCqh5zWOzuXxUmvDrj0oB0ClauDFNTBw+G886D88+HtddOOyoRyVVSVVl7\nEaqy3mNmmwDru/vMJsSZNyWHwpswIaxbaNcuzErq2DHtiESksZKoyjoIuAC4ODrVChiaV3RSUpYs\ngT/9Cfr3hz/+EZ5/XolBpLnIZUD6SGAA8C2Au88BSnTvLsnV00+HAeevvgqbrPzyl9poRaQ5yWWF\n9DJ3r7Lok8HM1ks2JEnTvHlwzjmhK+muu0IRNBFpfnJpOYwws9uAttHGP2OAO5MNSwqtqgpuvz0U\ny9thB3jvPSUGkeYs1wHpvkDf6OFId38+5xuYtQTeAma7+2Fm1g54CNgGmAUc5+4Lo9deDJwGrALO\ndvdRdVxPA9IxmzIlVMBcuTIkiF13TTsiEYlb7APSkfcI9ZVeio4b4xxgCqs3CLoIeN7dOxNaIRcB\nmFkX4HigC9APuMXMVKw3QUuXwsCB0Ls3/Pzn8MorSgwiEuQyW+nXwHjgKOBoYLyZ/SqXi5vZloTV\n1Heyep3E4cC90fG9wBHR8QDgAXdf4e6zgBnAnrn9M6Sx3GHAAJg4ESZNgt//XuW1RWS1XAakLwC6\nVy98M7MfA68Bd+Xw3n8AfwY2yDi3mbt/Hh1/DmwWHW8BvJ7xutlAhxzuIXl44gmYPRveeQfWyuW3\nQESalVy6bb4ElmQ8XhKdy8rMDgXmu/tEVrca1hANHmQbQNDgQgKWLQurnK+/XolBROqWrbbSedHh\nDEJX0uPR4wHkVrJ7H+BwM+sPtAY2MLP7gc/NbHN3n2dm7YH50evnAFtlvH/L6FwtgwYN+uG4oqKC\nioqKHMKRatdfH9YwHFxXSUURKQuVlZVUVlbm/f5stZUGsfqbu9U8dvcrcr6J2f7A+dFspb8CX7n7\nNWZ2EdDW3S+KBqSHEcYZOgCjCSU7vMa1NFupCebNg513htdeg06d0o5GRAolts1+3H1QLBFlXDL6\newgwPBrUngUcF91vipkNJ8xsWgmcqSwQv0sugVNPVWIQkexyWudQTNRyyN+ECXDooTB1Kmy4YdrR\niEghJbXOQUqceyiLceWVSgwi0jAlh2bioYfg229Dl5KISENyWQT3NzPbwMzWNrMxZvalmZ1UiOAk\nHt99BxdeGDbr0UI3EclFLi2Hvu7+DXAoYQB5B8LCNikR114Le+0VymSIiOQilyVQ1a85FHjY3ReZ\nmUaES8Snn4YWw4QJaUciIqUkl+TwpJlNBZYCvzOzTaNjKQEXXQRnngnbbpt2JCJSSnIt2f1jYKG7\nr4o2+2nj7vMSj67uWDSVNUevvgrHHRemrq6vvftEmrXYFsGZWR93H2NmRxMtYDP7YaNIBx5tUqSS\nqKqqMHV1yBAlBhFpvGzdSr0J+y0cRt0F8JQcitj994eZSSeckHYkIlKKtEK6DC1eDD/5CTz6aJil\nJCKiFdLC4MHQp48Sg4jkTy2HMjNzJuyxR9jEp4O2ShKRSKwtBzNrYWb7ND0sKZQ//xnOPVeJQUSa\npsGWg5lNcvduBYqnQWo51K+yMtROmjIFfvSjtKMRkWKSxJjDaDM7JmMaqxShVavgj3+Ev/5ViUFE\nmi6XlsMSYF1gFatXRru7b5BwbPXFo5ZDHW6/HYYODa0HpXERqamxLQcNSJeBhQvD1NVnn4Xu3dOO\nRkSKUezdStGg9Elm9r/R463NbM+mBCnxuvJKOPxwJQYRiU8u3Uq3AlXAge7+EzNrB4xy990LEWAd\n8ajlkGHaNNhvP3j/fdh007Sj
EZFiFVttpQx7uXt3M5sI4O5fm9naeUcosTrvvLCRjxKDiMQpl+Sw\n3Mx+2D/MzDYhtCQkZc89Bx9+GMpkiIjEKZeprDcCjwGbmtnVwCvA4ESjkgatWBEWu113HbRqlXY0\nIlJuGmw5uPu/zWwC0Cc6NcDdP0g2LGnIv/8NW24Jhx6adiQiUo4aTA5mdhUwFrjH3b9NPiTJxWOP\nhdXQWtMgIknIZbbSaUAvYG9gMTAOGOfujycfXp3xNPvZSkuXhgHoWbOgXbu0oxGRUpDYIjgz2xw4\nHjgf2MjdU9lfTMkhDET/5S8wblzakYhIqYh9KquZ3QXsBHwOvAwcDUzMO0Jpsqefhp/9LO0oRKSc\n5TJbqR0hiSwEvga+dPcViUYl9XJXchCR5OUyW+lIADPbCegHvGhmLd19y6SDk9qmToWVK2HnndOO\nRETKWS7dSocRBqR7AW2BFwiD0pKC6laDZimJSJJyWSHdD3gJuN7d5yYcjzTgqafg/PPTjkJEyl1O\ns5WimUp7AA684e7zkw4sSyzNdrbSwoWw9dYwbx6su27a0YhIKUmiZPdxwHjgWMJU1jfM7Nj8Q5R8\njRoFvXopMYhI8nLpVroM2KO6tRAV3hsDjEgyMKlNs5REpFBymcpqwBcZj7+KzkkBVVWFnd6UHESk\nEHJpOTwHjDSzYYSkcDzwbKJRSS1vvhlKZmyzTdqRiEhzkEvL4QLgNmA3YBfgNne/oKE3mVlrMxtv\nZpPMbIqZDY7O72lmb5jZRDN708z2yHjPxWY23cymmlnfPP9NZUldSiJSSDnXVsrr4mbruvt3ZrYW\nofTG+cCVwBB3H2lmhwAXuPsBZtYFGEaYFdUBGA10dveqGtdslrOVevSA66+H3r3TjkRESlFstZXM\nbAlh6mpd3N03aOji7v5ddNgKaAksAOYBG0bn2wJzouMBwANRaY5ZZjYD2BN4vaH7lLu5c0MF1n32\nSTsSEWku6k0OcVRdNbMWwNvADsA/3f19M7sIeNnMriV0a/1P9PItWDMRzCa0IJq9Z56Bvn1hrVxG\niEREYlDvmIOZtWnozQ29xt2r3L0bsCXQ28wqgLuAs919a+Bc4O5sl2gohubg6ae145uIFFa276KP\nmdk04AngLXf/GsDMfgzsDhwBdAIOaugm7r7IzJ6O3renu1e/52Hgzuh4DrBVxtu2ZHWX0xoGDRr0\nw3FFRQUVFRUNhfCDhx+G7bcPffilYNkyeOEFuOOOtCMRkVJSWVlJZWVl3u/POiBtZgcCJwD7Erp9\nAOYSBpeHunu9dzazjYGV7r7QzH4EjAT+D/grcK67jzWzPoTB6T0yBqT3ZPWAdMeao89NHZC+8EJo\n0wYuuyzvSxTU88/DwIHw6qtpRyIipSzWzX7c/QVCFdZ8tAfujcYdWgD3u/toMzsduNnM1gG+B06P\n7jXFzIYDU4CVwJlJTEvq1g0eeSTuqyZHU1hFJA2JTmVNQlNbDlOnhg/bjz6KMaiEuEOnTqErrFu3\ntKMRkVIWe+G9ctOpE8yfD4sWpR1Jwz78EJYuhd12SzsSEWluml1yaNkSdtkFJk1KO5KGPf009O+v\njX1EpPCyTWU9MON4uxrPHZVkUEnr3h0mTkw7ioZpvEFE0pKt5XBdxvGjNZ67PIFYCqZbt+JvOXzz\nDbzxBvTpk3YkItIcNbtuJSiNlsPzz8O++8L6TV6nLiLSeM0yOey8M0yfHhaYFSt1KYlImuqdympm\ni4CxhD0cegHjMp7u5e5tkw+vzrhiWf6w665wzz3Qs2cMQcWsqgq22CIsfNt++7SjEZFyEOciuAEZ\nx9fVeK7m45JT3bVUjMlhwgTYaCMlBhFJT7aqrJWZj82sFdAVmFO9n3QpK+ZBaXUpiUjask1lvc3M\ndo6ONwTeAe4DJpnZCQWKLzHFPCit5CAiacs25jDF3btEx38EKtz9CDPbHHguKsVdcHGNOSxcCF
tt\nFf5u2TKGwGIybx7stFNYxb322mlHIyLlIs7yGZlzefoSSnfj7vPyjK2otG0Lm2wCM2akHcmann0W\nDjpIiUFE0pUtOSwys8PMrAewD/AcgJmtDbQuRHBJ69at+LqWnnpKG/uISPqyJYczgD8A9wB/dPfP\novN9gKeTDqwQuncvrkHp5cthzBg45JC0IxGR5i7bbKVpwE/rOP8cUSui1HXvDjfemHYUq40bBzvu\nCJtumnYkItLc1ZsczOxGwh7OdQ1guLufnVhUBVI9Y8m9OCqfapaSiBSLbIvgfgtMBoYTtgaF1Ymi\ntHYIqscW0canc+dChw7pxgIhOQwblnYUIiLZk0N74FjgOGAV8BAwwt0XFiKwQjBbPSiddnKYMSNU\nYu3ePd04REQgy4C0u3/p7v909wOAU4ANgSlmdlKhgiuEYhmUrt7Yp0WzLIUoIsWmwY8iM+sJnAP8\nAngWmJB0UIVULCuln3pK4w0iUjyyrZC+EugPfAA8CIx09xUFjK1Oca2QrjZtWpg6+vHHsV2y0RYv\nDuMfc+dCmzbpxSEi5auxK6SzJYcqYCbwXR1Pu7vvml+ITRN3cli1KqyW/vTT8HcaHnsMbrklbPAj\nIpKEOEt2ZysYXRazlSDUVdpllzDuUFGRTgyawioixSbbgPSsuv4AnwB7FyzCAkhzUNodnnlGyUFE\niku2kt3rm9l5ZnaLmZ1pZi3M7EjgfeDEwoWYvDQHpSdODPtEd+qUzv1FROqSbbbSfcAuhH0c+gCv\nA+cCJ7j74QWIrWDSTA7qUhKRYpRtQPrd6kFnM2sJfAZs4+7fFzC+uuKKdUAaYOnSsC3nggXQusD1\nZvfaC/7yl1CmW0QkKXHu57Cq+sDdVxG2B001MSSldevQrTN5cmHvO38+TJ0KvXsX9r4iIg3Jlhx2\nNbPF1X+AXTIef1OoAAsljUHp6o19WrUq7H1FRBqSrWR3EW2embw0xh003iAixUqVfCKFTg4rVoRF\nb/37F+6eIiK5UnKI7LYbvPtuWDFdCK+8AjvsAJtvXpj7iYg0hpJDpG3bsAPb9OmFuZ+6lESkmCk5\nZCjkoLSqsIpIMVNyyFCocYePPoKvv4bdd0/+XiIi+VByyFCo5PDww3DkkdrYR0SKV2IfT2bW2szG\nm9kkM5tiZoMznjvLzD4ws8lmdk3G+YvNbLqZTTWzvknFVp/qLUNjXoBdy/DhcOyxyd5DRKQpspXs\nbhJ3X2pmB7j7d2a2FvCyme0HrA0cDuzq7ivMbBMAM+sCHA90AToAo82ss7tXJRVjTVtsEfaVnjs3\nuT2lP/4MU2X4AAAKiElEQVQ47B2x//7JXF9EJA6Jdmy4e/VGQa2AlsAC4LfA4Opd5dz9i+g1A4AH\n3H1FVBp8BrBnkvHVZJZ819KIEXDUUbBWYmlZRKTpEk0OUZnvScDnwIvu/j7QGehtZq+bWaWZVQ/L\nbgHMznj7bEILoqCSTg7qUhKRUpDo99eoS6ibmW0IjDSziuieG7n73ma2BzCc+nedq7P3f9CgQT8c\nV1RUUBHjFm7du8NDD8V2uTWoS0lECqWyspLKysq8319vye64mdnlwPeEvSGGuPvY6PwMws5yvwZw\n9yHR+eeAge4+vsZ1Yi/ZnWnaNOjXD2bOjP/a11wTrnvrrfFfW0QkmzhLdjc1kI3NrG10/CPgYGAi\n8DhwYHS+M9DK3b8E/gP8PzNrZWbbAZ2AN5KKrz6dOsGXX8LChfFfe/hwOO64+K8rIhK3JLuV2gP3\nmlkLQhK6393HmNlLwN1m9h6wHPglgLtPMbPhwBRgJXBmok2EerRoAbvuGlZKx9hbxUcfwezZ2rtB\nREpDwbqV4pJ0txLAH/4QiuKde2581xwyBGbNUpeSiKSjaLqVSln1Yrg4jRihLiURKR1KDnWIezqr\nupREpNQoOdRh553DB/rSpfFcb8SIUEtJC99EpFQoOdRhnX
XCrKXJk+O5nrqURKTUKDnUI66uJXUp\niUgpUnKoR1yD0qqlJCKlSMmhHnG1HEaMUC0lESk9WudQj0WLQtnuRYugZcv8rvHRR7DPPjBnjloO\nIpIurXOIyYYbwmabwfTp+V9DXUoiUqqUHLJoateSupREpFQpOWTRlEFpzVISkVKm5JBFU1oO6lIS\nkVKm5JBF9+6hOms+49/qUhKRUqbkkEX79qGE95w5jXufupREpNQpOWRhlt+4g7qURKTUKTk0IJ9x\nh+HD1aUkIqVNyaEBjU0OH30UuqHUpSQipUzJoQHVg9K5UpeSiJQDJYcGdOwIX34JCxbk9vrhw1We\nW0RKn5JDA1q0gF13za31oC4lESkXSg45yHXcobpLKd9CfSIixULJIQe5Jgd1KYlIuVByyEEug9Iz\nZqhLSUTKh5JDDrp2DR/+339f/2vUpSQi5UTJIQfrrAOdO8PkyfW/ZsQIdSmJSPlQcshRtnEHdSmJ\nSLlRcshRtuQwYgQcfbS6lESkfCg55CjboLTKc4tIuTHPZ7OCFJmZpxHzokXQoUP4O7OFMGMG7Lsv\nzJ2rloOIFC8zw90t19er5ZCjDTeEzTaDDz9c87y6lESkHCk5NEJd4w7qUhKRcqTk0Ag1k4NmKYlI\nuVJyaISag9LqUhKRcqXk0AjVW4ZWj4erS0lEypW2pGmE9u1DK2H2bFi2LMxQUpeSiJSjxFoOZtba\nzMab2SQzm2Jmg2s8f56ZVZlZu4xzF5vZdDObamZ9k4otX2arxx1US0lEylliycHdlwIHuHs3YFfg\nADPbD8DMtgIOBj6pfr2ZdQGOB7oA/YBbzKzour0yk0M5dClVVlamHUJZ0c8zPvpZpivRD193/y46\nbAW0BL6OHv8duKDGywcAD7j7CnefBcwA9kwyvnx07w6PPFI+XUr6Dxgv/Tzjo59luhJNDmbWwswm\nAZ8DL7r7FDMbAMx293drvHwLYHbG49lAhyTjy0e3bvDee+pSEpHyluiAtLtXAd3MbENgpJn1By4G\nMscTsi3nLrraHh07Qps25dGlJCJSn4LVVjKzywkf9mcB1d1NWwJzgL2AUwHcfUj0+ueAge4+vsZ1\nii5hiIiUgsbUVkosOZjZxsBKd19oZj8CRgJXuPuYjNfMBHq6+9fRgPQwwjhDB2A00DGVKnsiIs1c\nkt1K7YF7oxlHLYD7MxND5IcP/mg8YjgwBVgJnKnEICKSjpIr2S0iIskrunUE2ZhZv2iB3HQzuzDt\neEqZmc0ys3fNbKKZvZF2PKXGzO42s8/N7L2Mc+3M7Hkz+9DMRplZ2zRjLCX1/DwHmdns6Hd0opn1\nSzPGUmFmW5nZi2b2vplNNrOzo/ON+v0smeRgZi2BmwgL5LoAPzezndKNqqQ5UOHu3d296NaTlIB7\nCL+LmS4Cnnf3zsCY6LHkpq6fpwN/j35Hu7v7cynEVYpWAOe6e1dgb+D30Wdlo34/SyY5EAaqZ7j7\nLHdfATxIWDgn+ct55oKsyd3HAQtqnD4cuDc6vhc4oqBBlbB6fp6g39FGc/d57j4pOl4CfECY5NOo\n389SSg4dgE8zHhflIrkS4sBoM3vLzH6TdjBlYjN3/zw6/hzYLM1gysRZZvaOmd2lbrrGM7Ntge7A\neBr5+1lKyUEj5/Ha1927A4cQmp290g6onEQz7fQ72zT/BLYDugGfAdelG05pMbP1gUeAc9x9ceZz\nufx+llJymANslfF4K9YstyGN4O6fRX9/ATxGEdaxKkGfm9nmAGbWHpifcjwlzd3newS4E/2O5szM\n1iYkhvvd/fHodKN+P0spObwFdDKzbc2sFaGC639Sjqkkmdm6ZtYmOl6PUM7kvezvkhz8Bzg5Oj4Z\neDzLa6UB0QdYtSPR72hOzMyAu4Ap7n59xlON+v0sqXUOZnYIcD2hwutd7j64gbdIHcxsO0JrAcJC\nyKH6WTaOmT0A7A9sTO
i//V/gCWA4sDUwCzjO3RemFWMpqePnORCoIHQpOTATOCOjz1zqEW2N8BLw\nLqu7ji4G3qARv58llRxERKQwSqlbSURECkTJQUREalFyEBGRWpQcRESkFiUHERGpRclBRERqUXIQ\nEZFalBxEEhLtmdEuz/eenLlCuCnXEsmHkoNIcpqywvQUYIsa11L5aikYJQcpe1E9rqlmdo+ZTTOz\noWbW18xeiXbF2iP686qZvR2d7xy991wzuys63sXM3jOz1vXc58fRDluTzewOMj7MzewXZjY+2tHs\n1mhvdcxsiZn9PXrPaDPb2MyOAXYHhkbxVN/vLDObEO3gt2OSPzMRJQdpLnYArgV+AuwIHO/u+wLn\nA5cQNkTp5e49CHV9ro7edz3Q0cyOBO4GTnf3pfXcYyDwkrvvTKhdtTVAtAvXccA+UZn0KuDE6D3r\nAm9G7xkLDHT3hwmFJk9w9x4Z9/vC3XsSSlmf3+SfiEgWa6UdgEiBzHT39wHM7H1gdHR+MrAt0Ba4\n38w6Erpw1oZQ997MTiFUBP2nu7+W5R69CNVDcfdnzGwBofXQB+gJvBUKZvIjYF70nirgoej438Cj\nGder2Y1U/dzbwFG5/KNF8qXkIM3FsozjKmB5xvFawJXAGHc/0sy2ASozXt8ZWExuOw/WNy5wr7tf\nksN7M8cpao5ZVP8bVqH/u5IwdSuJhA/lDYC50eNTf3jCbEPgBkKr4MdmdnSW67wEnBC97xBgI8IH\n/BjgGDPbJHqunZltHb2nBXBsdHwCMC46XhzFJJIKJQdpLmp+C898XAX8DRhsZm8T9gupfv7vwE3u\nPgP4FTDEzDau5x5XAL3NbDKhe+kTAHf/ALgMGGVm7wCjgM2j93wL7Glm7xH2L/i/6Py/gFtrDEhn\nxq5a+5Io7ecgkiIzW+zubdKOQ6QmtRxE0qVvZ1KU1HIQaaRo9tI5NU6/7O5npRCOSCKUHEREpBZ1\nK4mISC1KDiIiUouSg4iI1KLkICIitSg5iIhILf8fCCCVmBQwGrEAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1818f8d0>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# plot max_depth (x-axis) versus RMSE (y-axis)\n",
    "plt.plot(max_depth_range, RMSE_scores)\n",
    "plt.xlabel('max_depth')\n",
    "plt.ylabel('RMSE (lower is better)')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(340.03416870475201, 2)"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the best RMSE and the corresponding max_depth\n",
    "sorted(zip(RMSE_scores, max_depth_range))[0]"
   ]
  },
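  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The manual search above could also be written with scikit-learn's `GridSearchCV`. The snippet below is a minimal sketch rather than part of the original analysis: it uses synthetic data from `make_regression`, and the names `X_demo`, `y_demo`, and `grid` are illustrative assumptions chosen so that it runs on its own and leaves `X`, `y`, and `treereg` untouched. Note that it reports the square root of the mean MSE across folds, which is close to, but not identical to, the mean of the per-fold RMSE values computed above.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.datasets import make_regression\n",
    "from sklearn.model_selection import GridSearchCV\n",
    "from sklearn.tree import DecisionTreeRegressor\n",
    "\n",
    "# hypothetical demo data standing in for the Hitters features and Salary\n",
    "X_demo, y_demo = make_regression(n_samples=200, n_features=13, noise=10.0, random_state=1)\n",
    "\n",
    "# search the same max_depth values with 10-fold cross-validation\n",
    "param_grid = {'max_depth': list(range(1, 21))}\n",
    "grid = GridSearchCV(DecisionTreeRegressor(random_state=1), param_grid,\n",
    "                    cv=10, scoring='neg_mean_squared_error')\n",
    "grid.fit(X_demo, y_demo)\n",
    "\n",
    "# best_score_ is the mean negated MSE across folds, so negate it before taking the square root\n",
    "print(grid.best_params_, np.sqrt(-grid.best_score_))\n",
    "```\n",
    "\n",
    "By default `GridSearchCV` also refits the best model on all of the data, available as `grid.best_estimator_`."
   ]
  },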
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DecisionTreeRegressor(criterion='mse', max_depth=2, max_features=None,\n",
       "           max_leaf_nodes=None, min_samples_leaf=1, min_samples_split=2,\n",
       "           min_weight_fraction_leaf=0.0, random_state=1, splitter='best')"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# max_depth=2 was best, so fit a tree using that parameter\n",
    "treereg = DecisionTreeRegressor(max_depth=2, random_state=1)\n",
    "treereg.fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>feature</th>\n",
       "      <th>importance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AtBat</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>HmRun</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Runs</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>RBI</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Walks</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>League</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Division</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>PutOuts</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Assists</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Errors</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>NewLeague</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Years</td>\n",
       "      <td>0.488391</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Hits</td>\n",
       "      <td>0.511609</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      feature  importance\n",
       "0       AtBat    0.000000\n",
       "2       HmRun    0.000000\n",
       "3        Runs    0.000000\n",
       "4         RBI    0.000000\n",
       "5       Walks    0.000000\n",
       "7      League    0.000000\n",
       "8    Division    0.000000\n",
       "9     PutOuts    0.000000\n",
       "10    Assists    0.000000\n",
       "11     Errors    0.000000\n",
       "12  NewLeague    0.000000\n",
       "6       Years    0.488391\n",
       "1        Hits    0.511609"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# compute feature importances\n",
    "pd.DataFrame({'feature':feature_cols, 'importance':treereg.feature_importances_}).sort_values('importance')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Predicting salary with a Random Forest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,\n",
       "           max_features='auto', max_leaf_nodes=None, min_samples_leaf=1,\n",
       "           min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
       "           n_estimators=10, n_jobs=1, oob_score=False, random_state=None,\n",
       "           verbose=0, warm_start=False)"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.ensemble import RandomForestRegressor\n",
    "rfreg = RandomForestRegressor()\n",
    "rfreg"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tuning n_estimators\n",
    "\n",
    "One important tuning parameter is **n_estimators**, the number of trees to grow. Choose a value large enough that the error has \"stabilized\", meaning that adding more trees no longer changes it noticeably."
   ]
  },
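  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One inexpensive way to watch the error level off is the out-of-bag score, which scikit-learn computes when `oob_score=True`. The snippet below is a minimal sketch on synthetic data, not part of the original analysis; the names `X_demo`, `y_demo`, `rf_demo`, and the particular tree counts are illustrative assumptions. For a regressor, `oob_score_` is an R-squared value, so higher is better, and it should change less and less as trees are added.\n",
    "\n",
    "```python\n",
    "from sklearn.datasets import make_regression\n",
    "from sklearn.ensemble import RandomForestRegressor\n",
    "\n",
    "# hypothetical demo data standing in for the Hitters features and Salary\n",
    "X_demo, y_demo = make_regression(n_samples=200, n_features=13, noise=10.0, random_state=1)\n",
    "\n",
    "# refit with more trees each time and record the out-of-bag R-squared\n",
    "for n_trees in [50, 100, 200, 300]:\n",
    "    rf_demo = RandomForestRegressor(n_estimators=n_trees, oob_score=True, random_state=1)\n",
    "    rf_demo.fit(X_demo, y_demo)\n",
    "    print(n_trees, round(rf_demo.oob_score_, 3))\n",
    "```\n",
    "\n",
    "Because each tree is scored only on the observations left out of its bootstrap sample, this needs one fit per value instead of one fit per fold."
   ]
  },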
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# list of values to try for n_estimators\n",
    "estimator_range = range(10, 310, 10)\n",
    "\n",
    "# list to store the average RMSE for each value of n_estimators\n",
    "RMSE_scores = []\n",
    "\n",
    "# use 5-fold cross-validation with each value of n_estimators (WARNING: SLOW!)\n",
    "for estimator in estimator_range:\n",
    "    rfreg = RandomForestRegressor(n_estimators=estimator, random_state=1)\n",
    "    MSE_scores = cross_val_score(rfreg, X, y, cv=5, scoring='neg_mean_squared_error')\n",
    "    RMSE_scores.append(np.mean(np.sqrt(-MSE_scores)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0x1860e710>"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEQCAYAAACugzM1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XucXfO9//HXO4kQJIIQkYQEoaLq2rgz4VC0aIoq2tLL\nqVMtWtqi1TalPSh9tGipc3pTP1SVut/JuItLEiGJ4JAiSCVIBJHLfH5/fNeYbczs2bNn1t6z97yf\nj8d6zJq1117rs7Iz85nvXRGBmZlZe/pUOwAzM+vZnCjMzKwoJwozMyvKicLMzIpyojAzs6KcKMzM\nrKjcEoWkVSRNljRN0kxJZ2bHz5E0S9ITkq6RtEZ2fJSk9yRNzbYL84rNzMxKpzzHUUhaNSLeldQP\nuB/4HjAAuCsimiSdBRARp0gaBdwQEVvmFpCZmXVarlVPEfFuttsf6Au8ERF3RERTdnwyMCLPGMzM\nrGtyTRSS+kiaBswDJkXEzFanfBW4ueD70Vm1U6OkXfOMzczMSpN3iaIpIrYmlRp2l9TQ/JqkHwFL\nI+Ly7NArwMiI2AY4Ebhc0sA84zMzs471q8RNImKhpJuA7YFGSUcD+wN7FZyzFFia7U+R9H/AGGBK\n4bUkeXIqM7MyRITKeV+evZ6GSBqc7Q8A9gamStoX+D5wUEQsaXV+32x/I1KSeL6ta0dE3W4//elP\nqx6Dn83P5+erv60r8ixRDAMukdSHlJAujYi7JD1Laty+QxLAQxFxLLAH8DNJy4Am4JiIeCvH+MzM\nrAS5JYqIeBLYto3jY9o5/2rg6rziMTOz8nhkdg/T0NBQ7RByU8/PBn6+Wlfvz9cVuQ64y4OkqLWY\nzcyqTRLR0xqzzcysPjhRmJlZUU4UZmZWlBOFmZkV5URhZmZFOVGYmVlRThRmZlaUE4WZmRXlRGFm\nZkU5UZiZWVFOFGZmVpQThZmZFeVEYWZmRTlRmJlZUU4UZmZWlBOFmZkV5URhZmZFOVGYmVlRThRm\nZlaUE4WZmRVVt4li4UJYsKDaUZiZ1b66TRTnnQe//nW1ozAzq311myhGjoSXX652FGZmta9uE8WI\nEfDSS9WOwsys9tV1onCJwsys6+o+UURUOxIzs9qWW6KQtIqkyZKmSZop6czs+DmSZkl6QtI1ktYo\neM+pkp6V9LSkfbpy/4EDYaWV4K23uvokZma9W26JIiKWAOMjYmvgE8B4SbsCtwNbRMRWwDPAqQCS\nxgKHAWOBfYELJXUpPrdTmJl1Xa5VTxHxbrbbH+gLvBERd0REU3Z8MjAi2z8IuCIilkXEHOA5YFxX\n7u+eT2ZmXZdropDUR9I0YB4wKSJmtjrlq8DN2f76QOGv9ZeB4V25vxu0zcy6Lu8SRVNW9TQC2F1S\nQ/Nrkn4ELI2Iy4tdoiv3d9WTmVnX9avETSJioaSbgO2BRklHA/sDexWcNhcYWfD9iOzYR0ycOPGD\n/YaGBhoaGtq874gRcP/9XQjczKxGNTY20tjY2C3XUuTUf1TSEGB5RLwlaQBwG/AzYCXgV8AeETG/\n4PyxwOWkdonhwJ3AJtEqQEmtD7Xr9tvhnHPgjju644nMzGqXJCJC5bw3zxLFMOCSrOdSH+DSiLhL\n0rOkxu07JAE8FBHHRsRMSX8HZgLLgWNLzgjtcNWTmVnX5VaiyEtnShSLFsH668Pbb4PKyqNmZvWh\nKyWKuh2ZDTBoEPTtm6YcNzOz8tR1ogB3kTUz66pekSjcTmFmVr66TxQenW1m1jV1nyhc9WRm1jW9\nIlG46snMrHy9IlG4RGFmVr66TxRuozAz65q6TxTNVU81Nq7QzKzHqPtEMWhQ+rpoUXXjMDOrVXWf\nKCRXP5mZdUXdJwpwg7aZWVf0mkThLrJmZuXpFYnCVU9mZuXrFYnCVU9mZuXrNYnCVU9mZuUpOVFI\nWkXSynkGkxeXKMzMytfuUqjZEqafBQ
4HdiYlFUlaATwEXAZc29XlSivBbRRmZuVrdylUSfcC9wHX\nA9Mi4v3s+MrANsCBwK4RsXuFYm2Oq9O5KQIGDoRXXmkZgGdm1pt0ZSnUYoli5ebkUOTGHZ7T3cpJ\nFAAf+xhccw2MHZtDUGZmPVwua2ZHxPuS+kl6utg55dy0Glz9ZGZWnqKN2RGxHJgtacMKxZMbN2ib\nmZWn3cbsAmsBMyQ9AryTHYuIODC/sLqfu8iamZWnlETx4zaO9fieTq2NHAmPPlrtKMzMak+H4ygi\nohGYA/TL9h8BpuYaVQ5c9WRmVp4OE4WkbwBXARdnh0YA/8wzqDy46snMrDyljMz+FrArsAggIp4B\n1s0zqDy4RGFmVp5SEsX7hd1gJfWjBtso1lwTli2Dt9+udiRmZrWllERxj6QfAatK2ptUDXVDvmF1\nPymVKubOrXYkZma1pZREcTLwOvAkcAxwM3BaR2/KJhGcLGmapJmSzsyOHypphqQVkrYtOH+UpPck\nTc22C8t7pPa5ncLMrPNK6R57XEScB/xP8wFJJwDnFXtTRCyRND4i3s2qq+6XtCsp4UygpXG80HMR\nsU3p4XeOR2ebmXVeKSWKo9s49pVSLh4R72a7/YG+wBsR8XTWIF5xbtA2M+u8YtOMHw4cAYyWVNgm\nMRBYUMrFs6nKpwAbAxdFxMwO3jJa0lRgIXBaRNxfyn1KNWIETJnSnVc0M6t/xaqeHgReBYYA5wLN\nsw4uAqaXcvGIaAK2lrQGcJukhmzQXlteAUZGxJtZ28W1kraIiI/0U5o4ceIH+w0NDTQ0NJQSDiNG\nwPXXl3SqmVlNa2xspLGxsVuu1e404x+cIP0yIn7Q6tjZEXFyp24k/Rh4LyLOzb6fBJwUEW3+jd/e\n6+VOMw7wxBPwpS/B9JLSnJlZ/chlmvECe7dxbP+O3iRpiKTB2f6A7Dqtp/5Qq/P7ZvsbAWOA50uI\nr2Tu9WRm1nnF2ii+CRwLbCzpyYKXBgIPlHDtYcAlWTtFH+DSiLhL0gTgfFKV1k2SpkbEfsAewM8k\nLQOagGMi4q2ynqoda60F778PixfD6qt355XNzOpXsRXu1gDWBM4ijaVo/uv/7YgoqTE7D12pegLY\ndFO44QbYbLNuDMrMrIfLa4W7hRExJyK+AGwAjI+IOUAfSaPLC7X63EXWzKxzSpk9diLwA+DU7FB/\n4LIcY8qV2ynMzDqnlMbsCcBBZKvbRcRcoGZr+D0628ysc0qdPbap+RtJq+UYT+5c9WRm1jmlJIqr\nJF0MDM4WMboL+EO+YeXHVU9mZp3T4aSAEXGOpH2At4FNgR9HxB25R5YTlyjMzDqnlNljIc34OoC0\nYNGTHZzbo7mNwsysc0rp9fR1YDLwOeBgYLKkr+UdWF7WXhvefTdtZmbWsVLmenoG2Kl5kJ2ktYGH\nImLTCsTXVjxdGnAHsMkmcPPNafCdmVlvkPdcT/OBxQXfL86O1SxXP5mZla7YXE8nZbvPkaqbrs2+\nP4gSpxnvqdygbWZWumKN2QNJjdf/R5rFtbm+57qC/ZrkLrJmZqVrN1FExMQKxlFRI0fCkzXdd8vM\nrHJKaaOoO656MjMrXa9NFK56MjMrTa9NFC5RmJmVppQBd+dIGiRpJUl3SZov6UuVCC4v66yTVrl7\n771qR2Jm1vOVUqLYJyIWAZ8B5gAbA9/PM6i8STB8uEsVZmalKCVRNPeM+gzwj4hYSI13jwVXP5mZ\nlaqUSQFvkPQ0sAT4pqR1s/2a5tHZZmal6bBEERGnALsA20XEUtJKdwflHVjeXKIwMytNsSk89oqI\nuyQdTFbVJKl5QqkArqlAfLkZMQJmzqx2FGZmPV+xqqfdSavZHUDbbRI1nShGjoTbb692FGZmPV+x\nKTx+mn09umLRVJCrnszMStMrB9yBR2ebmZWqw4WLepruWLgIoKkJBgyAhQthlVW6ITAzsx4st4WL\nJP
WRtHN5YfVsffqkQXdz51Y7EjOznq1oooiIJuDCCsVScW6nMDPrWCltFHdKOqSga2xJJK0iabKk\naZJmSjozO36opBmSVkjattV7TpX0rKSnJe3TmfuVw+0UZmYdK2Vk9n8BJwIrJDWPyI6IGFTsTRGx\nRNL4iHhXUj/gfkm7Ak8CE4CLC8+XNBY4DBgLDCclqE2zUk0uPDrbzKxjHSaKiFi93ItHxLvZbn+g\nL/BGRDwNqWGllYOAKyJiGTBH0nPAOODhcu/fkREjYPbsvK5uZlYfSplmvI+kL0n6Sfb9BpLGlXLx\n7L3TgHnApIgoNhZ6faDw7/uXSSWL3LjqycysY6VUPV0INAF7AqcDi7Nj23f0xqzaaGtJawC3SWqI\niMZOxNdmP9iJEyd+sN/Q0EBDQ0MnLtnCjdlmVq8aGxtpbGzslmt1OI5C0tSI2Kb5a3bsiYjYqlM3\nkn4MvBcR52bfTwJOiogp2fenAETEWdn3twI/jYjJra7TLeMoAF57DbbaCubN65bLmZn1WLmNo8gs\nldS34GbrkEoYHQU1RNLgbH8AsDcwtfVpBfvXA1+Q1F/SaGAM8EgJ8ZVt3XXhrbfg/ffzvIuZWW0r\nJVFcAPwTWFfSfwMPAGeW8L5hwN1ZG8Vk4IZsNtoJkl4CdgRuknQLQNZ+8XdgJnALcGy3FR3a0acP\nDBvmQXdmZsWUNIWHpM2BvbJv74qIWblGVTyWbs0fu+0Gv/gF7L57t13SzKzH6UrVU4eN2ZJ+DtwD\n/Dki3innJj2ZG7TNzIorperpeeAI4DFJj0j6laTP5hxXxbiLrJlZcaUshfqniPgKMB64DPg88P/y\nDqxSPDrbzKy4Ugbc/VHSg8BFpKqqg4E18w6sUlz1ZGZWXClVT2uREsRbwBvA/Gyajbrgqiczs+JK\nmetpAnzQ82lfYJKkvhExIu/gKsElCjOz4krp9XQAsFu2DQbuBu7LOa6KGToU3ngDli6F/v2rHY2Z\nWc9TylxP+wL3Ar+JiFdyjqfi+vZtGXQ3enS1ozEz63lK6fX0LdI4iu0kfUbSuvmHVVmufjIza18p\nvZ4+T5qC41DSwkKPSDo078AqyV1kzczaV0rV02nAJyPi3/DBpIB3AVflGVgluURhZta+UrrHCni9\n4PsFfHjW15rnLrJmZu0rpURxK2nRoctJCeIw0uyudWPLLeHii93zycysLaUsXCTgc8CupBXn7ouI\nf1Ygtvbi6fbZxyPgM5+BXXeFU0/t1kubmfUIXZk9tqRpxnuSPBIFwJw5sP328MgjsNFG3X55M7Oq\nyiVRSFpMO2tWAxERg8q5YVfllSgAzj4bJk2CW24B1VUrjJn1di5RdJNly2DbbeG00+Cww3K5hZlZ\nVeRVohgYEW93cOMOz+lueSYKgAcfhEMPhRkzYPDg3G5jZlZReSWKO4HZwHXAYxHxRnZ8bWB74LPA\nmIj4j7KiLlPeiQLgmGOgXz/43e9yvY2ZWcXkVvUkaU/S6na7AOtnh18B7gcui4jGcm7aFZVIFG++\nCWPHwnXXwbhxud7KzKwi3EaRg8sug3POgcceS6ULM7Na1pVEUcrI7F7piCNgnXXgvPOqHYmZWXW5\nRFHEs8/CTjvBlCmwwQYVuaWZWS5cosjJmDFwwglw3HHVjsTMrHraTRRZQ3bz/uhWr30uz6B6kh/8\nAGbPhmuvrXYkZmbVUax77NSI2Kb1flvfV1Ilq56aNTbCl7+cxlYMHFjRW5uZdQtXPeWsoQH23BN+\n8pNqR2JmVnlOFCU691y4/PLUsG1m1psUq3paSForW8BuwH0FL+8WEUUnuJC0Svb+lYH+wHURcaqk\ntYArgQ2BOcDnI+ItSaOAWcDT2SUeiohj27huxauemv35z3DhhfDww9C3b1VCMDMrS15TeDQUe2Mp\no7IlrRoR70rqRxrN/T3gQGB+RPxS0snAmhFxSpYoboiILTu4ZtUS
RUSqhjr0UPj2t6sSgplZWSoy\nMltSf2ALYG7z+tkl30RalVS6OBq4GtgjIuZJWg9ojIiP1UKiAHjqKdhrL5g71yO2zax25NKYLeli\nSR/P9tcAngD+CkyTdESJgfWRNA2YB0yKiBnA0IiYl50yDxha8JbRkqZKapS0axnPk7uPfxw23BDu\nvrvakZiZVUaxv4l3i4hjsv2vALMj4rNZKeBW4PKOLh4RTcDWWaK5TdL4Vq+HpObiwSvAyIh4U9K2\nwLWStmhrGvOJEyd+sN/Q0EBDQ0NHoXSrI45IDdv77FPR25qZlayxsZHGxsZuuVap4yhuBq6KiD9n\n30+LiK07dSPpx8B7wNeBhoh4TdIwUknjY22cPwk4KSKmtDpe1aongFdegS22gFdfhVVWqWooZmYl\nyWscxUJJB2R/3e9MKkUgaSWgw1+PkoZIGpztDwD2BqYC1wNHZacdBVxbcH7fbH8jYAzwfDkPlbf1\n14dttoGbb652JGZm+StW9XQMcD6wHvCdiHg1O74XcFMJ1x4GXCKpDykhXRoRd0maCvxd0tfIusdm\n5+8OnC5pGdAEHBMRb3X2gSqlufrpc71mMhMz6608e2yZ3nwTRo2Cl16CQYOqHY2ZWXFdqXpqt0Qh\n6QIgSAPuWouIOL6cG9aLNdeEPfZIkwV++cvVjsbMLD/FGrOXAU8Bfyf1SIKWpBERcUn+4bUZV48o\nUQD87W/wl7/ArbdWOxIzs+LyGpk9BDiU1IawgjTtxlXVbjfoSYninXdg+PC0wNE661Q7GjOz9uXS\n6yki5kfERRExnjSieg1gpqQvlRdm/VltNdh/f7jqqmpHYmaWnw5nj5W0HXAC8EXgFuDxvIOqJUcc\nAVdcUe0ozMzyU6zq6Qxgf9KMrn8DbouIZRWMrU09qeoJYOnSNK7C62qbWU+WVxtFE/AC8G4bL0dE\nfKKcG3ZVT0sUAN/4BmyySVo21cysJ8orUYwq8r6IiH+Vc8Ou6omJYtIkOPFEmDq12pGYmbWtItOM\nF9xMpMWGriznhl3VExPFihWp2unOO2HzzasdjZnZR+U1zfjqkk6SdKGkY7MpwycAM4Ajyw22HvXt\nC4cd5kZtM6tPxaqergEWAQ8B+wAjgSXA8RExrWIRfjSuHleiAHj00dQD6plnQGXlbDOz/OTVRjG9\nucE6m9X1VWDDiHiv7Ei7QU9NFBGw6aZposBPfrLa0ZiZfVhe04yvaN6JiBWkJVCrmiR6MgkOP9zV\nT2ZWf4qVKFbw4a6xA0gLD0Hq9VSVOVN7aokCYNYs+I//gBdfTO0WZmY9RV5TePSNiIEFW7+CfU+s\n3YbNN4d114V77612JGZm3afDKTysc1z9ZGb1xgsXdbMXX4Rtt03ravfvX+1ozMySvBqzrQwbbJCq\noG67rdqRmJl1DyeKHLj6yczqiauecvD66zBmDMydm9asMDOrNlc99TDrrAM77QQ33FDtSMzMus6J\nIieHH55GaZuZ1TpXPeVk0SIYORJeeAHWWqva0ZhZb+eqpx5o0CDYe2+45ppqR2Jm1jVOFDk64ghX\nP5lZ7XOiyNH++8Nzz9VWo/Z778G//13tKMysJ3GiyNEqq8CVV8LXvw7PP1/taDq2fDlMmACf+ATM\nnFntaMysp+hX7QDq3U47wWmnwcEHw4MPwoAB1Y6ofSedBE1NcPbZaRbc22+Hj3+82lGZ9QzvvAPX\nXZeqk199NXVWad5GjGjZHz4cVlqp2tF2r9wShaRVgHuAlYH+wHURcaqktYArgQ2BOaT1t9/K3nMq\n8FXSWhjHR8TtecVXSd/+Njz0EHzrW/DHP/bMFfB+//s07cjDD8Pgwek/+t57p2Sx5ZbVjs6sOpYv\nhzvvhMsuS1XIO+0ERx6ZFil76SV4+eX09fHH09eXXoJ582DIkJbEsd56sMYaqYNLe1+bt37d+Bv5\n3Xfh2WfTqpvPPNO1a+XaPVbS
qhHxrqR+wP3A94ADgfkR8UtJJwNrRsQpksYClwOfBIYDdwKbRkRT\nq2vWRPfY1hYvhh12gO9+N1VF9SR3353GfTzwAGyyScvxK6+E73wHbr0VttqqevGZVVJEWtr4ssvS\nz8CGG8IXvwif/zwMHdrx+5cvh9dea0kcr72WussvWgQLF350v/DYaqvBsGGw/vppGz68Zb95Gzas\npWZixQqYM6clGcye3bL/+uuw8caw2WYpsZ11Vg5LoXYnSauSShdHA1cDe0TEPEnrAY0R8bGsNNEU\nEWdn77kVmBgRD7e6Vk0mCkgf4m67wS23wHbbVTua5JlnUkx/+xuMH//R1//xj1QiuuUW2Gabysdn\nVinPPpuSw2WXpVL/kUemnotjxlTm/hEpWbz6app9eu7c9LWtbbXVYM010zlDh7Ykg003bdnfYIMP\nL6DWlXEUubZRSOoDTAE2Bi6KiBmShkbEvOyUeUBzjl4fKEwKL5NKFnVjs83gwgvhkENSUbXaA/He\nfBMOOAB+/vO2kwSkWPv2hf32g5tu6jkJzqyrFi2C++5LJeq7706/oA87LLVBbL995auIpVQVtcYa\n8LGPtX9eBLzxBixYkNpGVl01/9hyTRRZtdHWktYAbpM0vtXrIalY8aA2iw5FHHJIagf44hfhxhuh\nT5X6nS1bBocemrrw/ud/Fj93woQU5/77p5g/+cnKxGi924IFqfdd4TZnTqoKGjsWttiiZVtzzY6v\n9847qXp10qS0PfUUjBsHe+4Jv/1tqhruzjaCvEiw9tppq5SK/LNExEJJNwHbAfMkrRcRr0kaBjT3\n2p8LjCx424js2EdMnDjxg/2GhgYaGhryCDs3Z54Je+2V/pL/yU+qE8N3vgMrrwznnlva+QcdlJLF\nZz4D11+ffqjMusO8eTBjRksymDUrfX3//ZQQmrf99ktJ4l//SudPngx/+lM6d+DADyeOsWNTldHM\nmSkp3H03TJ2aqk/Hj4ezzoIdd0xd2OtVY2MjjY2N3XKt3NooJA0BlkfEW5IGALcBPwM+BSyIiLMl\nnQIMbtWYPY6WxuxNWjdI1HIbRaFXX03F2z/9CT71qcre+7e/hYsuSj2xBnVy9fObb4ajj07dBHfa\nKZfwrM5FpF/gV1+dprh58cXUDbswKYwdmxptS6n+iUiNxjNmfHh75plUhTN+fNp22aV3T/vflTaK\nPBPFlsAlpEF9fYBLI+KcrHvs34EN+Gj32B+SuscuB06IiI+sE1cviQLg3ntTT4rJk9NfSpVw++1w\n1FGpCL7RRuVd49Zb4ctfhmuvhZ137t74rGtmzUp/ia+zTuqiufLK1Y4oiYApU1JiuPrq1HXzc59L\n2y67fLjR1fLRIxNFXuopUQD86lepx9H99+f/Qz1rFuyxR/pB3W23rl3r9ttTO8vvfw+jR6cf9I62\n1VevTMNbb7NsGfzzn3D++Wm24rXXTl0j589P3Sibk0ZbXzfbLK3xnsdA0Kam1B7XXHLo2zcNPD34\n4NTO1RPHE9UzJ4oaFpEalYcMSb9087JgQWpXOO20VHXUHe68E370o/QX7IoVHW/Ll8Mdd6Qqt95u\n8eKUOLti/nz43/9NPelGj4YTTkhtSc0NshGpj/78+S2Jo/Drv/+dqmhmzUpVNOPGpf8j48al7zvT\n0aKpKY0XeP75tE2enJLX2munUsPBB6eBm04O1eNEUeMWLUo/nKeemqqFutvSpbDPPukev/xl91+/\nVNdem0anP/RQ6uPdG913H5xxRmpcHT06leyat403Lu0X6fTpqfRw9dWpR9pxx3VtjMuSJamh95FH\n0i/4Rx5JiWT77VsSxw47pPasF15oSQaF2wsvpG6dG22Uti23TLFtumn5cVn3cqKoAzNmQENDGuyz\nzz7dd90lS1J7wpIl6S+8atcF//rXqQH/gQc635BeqyJSz5szzkg9dk49Fb70pTTA67770nbvve
m8\nwsSx5ZYtf9WvWJF6m51/fmqkPfZY+MY3UvVRHubPT6OTmxPH5MmpXWHUqJZksNFGKblttFFKer25\nobgWOFHUifvvT9VQp5/e8diGUsyfD5/9bBr2f8klPWNCwog00vu559KYjHqbPK1QRJo/64wz0mfx\nwx+mkb5tPXNE+qu8MHG8/npq6N18c7jqqvQ5Hn98qsap9L9bRNqqNe7Hus6Joo4891wa2DZhQhpv\nUe4P5rPPpuscfDD893/3rB/w5cvhwANT9dNFF9VfvXVESoJnnJEGeZ12Wurd1tnS3GuvpT8epk9P\nI+g90NG6womizixYkBLFuuvCpZd2viTwwAMpQZx+eqqe6Inefht23TVVwXzve9WOpns0NaXqvZ//\nPCWLH/+4ZVS7WbU5UdSh99+Hr30tlTCuu660WSshzXZ53HEpwVR6IF9nvfxyGrT3m9+kxFaL3nwz\n1d9PnpyqhwYMSAnigAPqr6Rktc2Jok5FwM9+Bn/9a6rKGDu2+Llnn526St54Y1qlrhZMmQL77pti\nHjeu2tEUt3x5mh/o4YdbtldeSb2DdtwxTcuy555OENYzOVHUuUsvTavPXXFF+mXU2rJlqRfMY4+l\nX7jDa2zO3RtvhGOOSVVmo0ZVO5oWCxem3krNSeHxx9NCNDvu2LJtsUX1e5KZlcKJohe4557UIHrm\nmfDVr7YcX7gwHe/XL43wHjiwejF2xfnnw8UXp2QxeHB1Y5k9Gy64IE03PW5cmqZkxx3TfrVjMyuX\nE0UvMXs2fPrTKTH8/Oepjv/Tn4bdd4fzzquNKZKLOf74NEr45psr3/2zqSnNYXXBBak67BvfgP/6\nr9ornZm1x4miF3n99TQ2YsiQVBVy4olpedV6qBdfsSI929ChaWqKSjzTokVpjMkFF6QpNU44IS1e\nU8/TT1vv5ETRyyxZkrqU7rVX6n5ZTxYvTiWkww6Dk0/O7z7PPpumW7/0Uth771Sa2Xnn+ki4Zm1x\norC6MnduGmOx996pJ1cpq5eV6p570nxXjz6aRr9/85tpOUmzeteVROGhQNbjDB8O06aldoottoC/\n/z11/+2K559Ps5gefXQas/Gvf8EvfuEkYVYKlyisR3vwwdSwPHo0/O53nZ91dtGiNIXJH/6Q2nNO\nPNHtD9Y7uURhdWvnnVMvpB12SAvsnHdeavTuyIoV8Mc/pnUVXnstzZf0wx86SZiVwyUKqxmzZ6eB\nee+8k3pFbb112+fdd1/qvTRgQJoexJPpmblEYb3EZpulkdLf/GZas+Pkk9MaCc3mzEljTI48Er7/\n/TTzqpOhd+fZAAAIH0lEQVSEWdc5UVhNkdLI9CefhBdfTIv73HBDmsp7u+1S4/fTT8Phh7urq1l3\ncdWT1bSbb07zYG27LZx1VpqLycw+yuMozMysKLdRmJlZbpwozMysKCcKMzMryonCzMyKcqIwM7Oi\nnCjMzKyo3BKFpJGSJkmaIekpScdnx7eS9JCk6ZKulzQwOz5K0nuSpmbbhXnFZmZmpcuzRLEM+G5E\nbAHsCHxL0ubAH4AfRMQngH8C3y94z3MRsU22HZtjbD1WY2NjtUPITT0/G/j5al29P19X5JYoIuK1\niJiW7S8GZgHDgTERcV922p3AwXnFUIvq+T9rPT8b+PlqXb0/X1dUpI1C0ihgG2AyMEPSQdlLhwKF\nky6MzqqdGiXtWonYzMysuNwThaTVgX8AJ0TE28BXgWMlPQasDizNTn0FGBkR2wAnApc3t1+YmVn1\n5DrXk6SVgBuBWyLiN228vilwaUTs0MZrk4CTImJKq+Oe6MnMrAzlzvXUr7sDaSZJwB+BmYVJQtI6\nEfG6pD7AacBF2fEhwJsRsULSRsAY4PnW1y33Qc3MrDy5JQpgF+CLwHRJU7NjPwTGSPpW9v3VEfGX\nbH934HRJy4Am4JiIeCvH+MzMrAQ1N824mZlVVs2MzJa0r6
SnJT0r6eRqx9MdJM3JBh5OlfRIdmwt\nSXdIekbS7ZIGVzvOUkn6k6R5kp4sONbu80g6Nfs8n5a0T3WiLl07zzdR0ssFA0X3K3itZp6vyADZ\nuvj8ijxfvXx+q0iaLGmapJmSzsyOd8/nFxE9fgP6As8Bo4CVgGnA5tWOqxue6wVgrVbHfkkakAhw\nMnBWtePsxPPsRuoG/WRHzwOMzT7HlbLP9TmgT7WfoYzn+ylwYhvn1tTzAesBW2f7qwOzgc3r5fMr\n8nx18fllMa+afe0HPAzs2l2fX62UKMaRRm3PiYhlwN+Agzp4T61o3Th/IHBJtn8J8NnKhlO+SAMp\n32x1uL3nOQi4IiKWRcQc0n/UcZWIs1ztPB989DOEGnu+aH+AbF18fkWeD+rg8wOIiHez3f6kP67f\npJs+v1pJFMOBlwq+f5mWD7mWBXCnpMck/Wd2bGhEzMv25wFDqxNat2nvedYnfY7NavkzPU7SE5L+\nWFC0r9nnazVAtu4+v4Lnezg7VBefn6Q+kqaRPqdJETGDbvr8aiVR1GuL+y6RBhjuR5oLa7fCFyOV\nEevm2Ut4nlp81ouA0cDWwKvAr4qc2+OfLxsgezUtA2Q/UA+fX6sBwIupo88vIpoiYmtgBLC7pPGt\nXi/786uVRDGXD0/1MZIPZ8OaFBGvZl9fJ02QOA6YJ2k9AEnDgH9XL8Ju0d7ztP5MR2THakpE/Dsy\npAkvm4vvNfd82QDZq0mDYK/NDtfN51fwfP+v+fnq6fNrFhELgZuA7eimz69WEsVjpPEXoyT1Bw4D\nrq9yTF0iaVW1TLG+GrAP8CTpuY7KTjsKuLbtK9SM9p7neuALkvpLGk0aYPlIFeLrkuyHr9kE0mcI\nNfZ87Q2QpU4+vyIDgOvl8xvSXG0maQCwNzCV7vr8qt1S34kW/f1IPRWeA06tdjzd8DyjSb0OpgFP\nNT8TsBZpVt1ngNuBwdWOtRPPdAVpzq6lpDalrxR7HtIAzOeAp4FPVTv+Mp7vq8BfgenAE9kP4dBa\nfD5SD5mm7P/j1Gzbt14+v3aeb786+vy2BKZkzzcd+H52vFs+Pw+4MzOzomql6snMzKrEicLMzIpy\nojAzs6KcKMzMrCgnCjMzK8qJwszMinKiMDOzopwozEogaatWaxUcoG5aF0XSd7LRtGY9kgfcmZVA\n0tHAdhFxXA7XfgHYPiIWdOI9fSKiqbtjMWuLSxRWV7L5wGZJ+p9sJbPbJK3SzrkbS7olm+b9Xkmb\nZccPlfRktlpYYzaZ3OnAYdkqaJ+XdLSkC7Lz/yLpQkkPSfo/SQ2SLslWGvtzwf0ulPRoFtfE7Njx\npCmfJ0m6Kzt2uNLKh09KOqvg/YslnZtNJb2TpLOyFduekHROPv+iZtTOXE/evJWykVbrWgZ8Ivv+\nSuDIds69C9gk298BuCvbnw4My/YHZV+PAs4veO9RwAXZ/l+Ay7P9A4FFwBakBXEeA7bKXlsz+9oX\nmAR8PPv+g5UOSUnjX8Da2Xl3AQdlrzUBh2T7awNPF8QzqNr/9t7qd3OJwurRCxExPdt/nJQ8PiRb\nl2An4CpJU4Hfk5bLBHgAuETS10nLSkL6pd/WSmiQ5vG/Idt/CngtImZERAAzCu5/mKTHSZO3bUFa\njrK1T5IWnVkQESuAy4Dds9dWkKbJBlgILMkW25kAvNdObGZd1q/jU8xqzvsF+yuAthqK+wBvRVo4\n6kMi4puSxgGfBh6XtF0J91yafW1qdf8moG82lfNJpLaIhVmVVFtVYsGHE5JoWVBmSZZ8iIjlWYx7\nAYcA3872zbqdSxTWK0XEIuAFSYdAWq9A0iey/Y0j4pGI+CnwOmlRl0XAwIJLtFe6aIuy974DLJI0\nlDTFdbO3gUHZ/qPAHpLWltQX+AJwz0cumNYwGRwRtwAnAlt1Ih6zTnGJwupR66587XXtOxK4SNJp\nwEqk9SamA7+UNIb0C/
7OiJgu6SXglKya6szsmoXXbW8f0iqU07P3Pk1ay+L+gtf/B7hV0tyI2EvS\nKaQ2DAE3RkRztVbhdQcC12UN9QK+284zmnWZu8eamVlRrnoyM7OiXPVkdU/Sb4FdWh3+TURcUo14\nzGqNq57MzKwoVz2ZmVlRThRmZlaUE4WZmRXlRGFmZkU5UZiZWVH/Hzo6x7t7BAjyAAAAAElFTkSu\nQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x18154e10>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# plot n_estimators (x-axis) versus RMSE (y-axis)\n",
    "plt.plot(estimator_range, RMSE_scores)\n",
    "plt.xlabel('n_estimators')\n",
    "plt.ylabel('RMSE (lower is better)')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tuning max_features\n",
    "\n",
    "The other important tuning parameter is **max_features**, which is the number of features that should be considered at each split."
   ]
  },
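  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the role of `max_features` concrete, the sketch below mimics only the sampling step: at every split, a tree in the forest draws a fresh random subset of candidate features and searches for the best split among those alone. This is a simplified illustration in plain Python, not scikit-learn's implementation; the helper name `candidate_features` and the shortened feature list are assumptions made for the example.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "def candidate_features(feature_names, max_features, rng):\n",
    "    # hypothetical helper: draw max_features distinct features for one split\n",
    "    return rng.sample(feature_names, max_features)\n",
    "\n",
    "# illustrative subset of the columns used in this notebook\n",
    "features = ['AtBat', 'Hits', 'HmRun', 'Runs', 'RBI', 'Walks', 'Years']\n",
    "rng = random.Random(1)\n",
    "\n",
    "# each split sees a different random subset of size max_features\n",
    "for split in range(3):\n",
    "    print(candidate_features(features, 3, rng))\n",
    "\n",
    "# max_features equal to the number of features recovers plain bagging:\n",
    "# every split may use every feature\n",
    "print(sorted(candidate_features(features, len(features), rng)) == sorted(features))\n",
    "```\n",
    "\n",
    "Smaller values make the trees less correlated with one another at the cost of making each individual tree weaker, which is why the value is worth tuning."
   ]
  },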
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# list of values to try for max_features\n",
    "feature_range = range(1, len(feature_cols)+1)\n",
    "\n",
    "# list to store the average RMSE for each value of max_features\n",
    "RMSE_scores = []\n",
    "\n",
    "# use 10-fold cross-validation with each value of max_features (WARNING: SLOW!)\n",
    "for feature in feature_range:\n",
    "    rfreg = RandomForestRegressor(n_estimators=150, max_features=feature, random_state=1)\n",
    "    MSE_scores = cross_val_score(rfreg, X, y, cv=10, scoring='neg_mean_squared_error')\n",
    "    RMSE_scores.append(np.mean(np.sqrt(-MSE_scores)))"
   ]
  },
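  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loop above negates the scores before taking the square root because scikit-learn reports mean squared error as a negative number, so that a higher score is always better. A minimal sketch of that conversion in plain Python follows; the three fold scores are hypothetical and chosen only so the arithmetic is easy to follow.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "# hypothetical negated MSE scores, one per cross-validation fold\n",
    "neg_mse_scores = [-4.0, -9.0, -16.0]\n",
    "\n",
    "# flip the sign, then take the square root of each fold's MSE\n",
    "rmse_per_fold = [math.sqrt(-score) for score in neg_mse_scores]\n",
    "mean_rmse = sum(rmse_per_fold) / len(rmse_per_fold)\n",
    "print(rmse_per_fold)  # [2.0, 3.0, 4.0]\n",
    "print(mean_rmse)      # 3.0\n",
    "\n",
    "# taking the root of the averaged MSE gives a different, larger number\n",
    "rmse_of_mean = math.sqrt(-sum(neg_mse_scores) / len(neg_mse_scores))\n",
    "print(rmse_of_mean > mean_rmse)  # True\n",
    "```\n",
    "\n",
    "Averaging the per-fold RMSE values, as this notebook does, and taking the root of the averaged MSE are both common, but they are not interchangeable, so it helps to stick to one when comparing models."
   ]
  },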
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0x18d5a320>"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEQCAYAAABbfbiFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xm8nPPd//HXOxtiSWwhiIqtxE6llrQ9VCJFUNqiy4/q\n4q7W1irCTeLW2qt3UW3vokVraS2pNUTqWKpCSYLEFgQJja0kCNk+vz++15Fx5mTOnGXONTPn/Xw8\n5pHruuZaPknOmc98d0UEZmZmhXrkHYCZmVUfJwczMyvi5GBmZkWcHMzMrIiTg5mZFXFyMDOzIhVL\nDpKWlzRJ0hRJ0yWd1ez9n0haImm1gmOjJT0n6WlJIyoVm5mZldarUjeOiA8l7RYRH0jqBTwgaVhE\nPCBpEDAceKnpfElDgIOAIcC6wN2SNo2IJZWK0czMWlbRaqWI+CDb7AP0BN7O9i8ATmh2+n7ANRGx\nMCJmAjOAoZWMz8zMWlbR5CCph6QpwBzgnoiYLmk/YFZEPN7s9HWAWQX7s0glCDMz62IVq1YCyKqE\ntpXUD7hT0l7AaKCwPUGlblHJ+MzMrGUVTQ5NIuJdSbcB2wODgamSANYDHpX0WWA2MKjgsvWyY58g\nyQnDzKwdIqLUl/FPqGRvpTUk9c+2VyA1QP8zItaKiMERMZhUdbR9RMwBbgYOltRH0mBgE+Dhlu4d\nETX7GjNmTO4xdMfYHX/+L8ef76utKllyGAhcIakHKQldFRETm53zccSR2iP+AkwHFgFHRnv+RmZm\n1mGV7Mr6BKkaqdQ5GzbbPxM4s1IxmZlZeTxCuos1NDTkHUK71XLs4Pjz5vhri2qt5kaSa5vMzNpI\nElENDdJmZla7nBzMzKyIk4OZmRVxcjAzsyJODmZmVsTJwczMijg5mJlZEScHMzMr4uRgZmZFnBzM\nzKyIk4OZmRVxcjAzsyJODmZmVsTJwczMijg5mJlZEScHMzMr4uRgZmZFnBzMzKyIk4OZmRVxcjAz\nsyJODmZmVsTJwczMijg5mJlZEScHMzMr4uRgZmZFnBzMzKxIxZKDpOUlTZI0RdJ0SWdlx8+QNDU7\nPlHSoIJrRkt6TtLTkkZUKrZlufJKmD27q59qZlZ9KpYcIuJDYLeI2BbYGthN0jDg3IjYJjs+DhgD\nIGkIcBAwBBgJXCKpS0s2EyfCuHFd+UQzs+pU0Q/fiPgg2+wD9ATejoh5BaesBLyZbe8HXBMRCyNi\nJjADGFrJ+JobNQpuuaUrn2hmVp0qmhwk9ZA0BZgD3BMR07PjP5f0MnAYcFZ2+jrArILLZwHrVjK+\n5kaMgH/8A+bNa/1cM7N6VumSw5Ks+mg94POSGrLjp0TE+sAfgP8tdYtKxtfcKqvALrvAXXd15VPN\nzKpPr654SES8K+k24DNAY8FbVwO3Z9uzgUEF762XHSsyduzYj7cbGhpoaGjotFibqpYOPLDTbmlm\n1uUaGxtpbGxs9/WKqMyXc0lrAIsi4h1JKwB3AqcDL0XEjOyco4ChEfGtrEH6alI7w7rA3cDG0SxA\nSc0PdaqXXoIdd4TXXoOePSv2GDOzLiWJiFC551ey5DAQuCLrcdQDuCoiJkq6XtKngcXA88APACJi\nuqS/ANOBRcCRFc0Cy/CpT8Haa8OkSamKycysO6pYyaFSKl1yADjlFFiyBM46q/VzzcxqQVtLDh4h\n3YJRo+Dmm/OOwswsP04OLRg6FN56C154Ie9IzMzy4eTQgh49YO+9PSDOzLovJ4dl8GhpM+vO3CC9\nDO+/DwMHwiuvQL9+FX+cmVlFuUG6k6y4IgwbBnfemXckZmZdz8mhBFctmVl35WqlEmbNgm22gTlz\noFeXTDRiZlYZrlbqROutl0ZMP/hg3pGYmXUtJ4dWuGrJzLojJ4dWODmYWXfk5NCK7beHuXPhuefy\njsTMrOs4ObSiRw/YZx+XHsyse3FyKM
O++3oiPjPrXtyVtQzz58Naa6WFgFZdtUsfbWbWKdyVtQJW\nWAEaGuCOO/KOxMysa5SdHCQtL2m5SgZTzdxrycy6k2VWK2XLe+4PHALsQkokIi3v+U/gz8C4rq7j\nyaNaCdKa0ltskUZL9+7d5Y83M+uQzqxWagR2AM4HNoyIgRGxNrBhdmxH4N4OxFpTBg6EjTaCBx7I\nOxIzs8orVXJYLiI+KnlxGed0trxKDgBnnAH/+Q9ccEEujzcza7dOKzlExEeSekl6utQ5bQ2wljWt\nLV1jHbzMzNqsZIN0RCwCnpH0qS6Kp6ptsw0sWABPLzNdmpnVh3Imol4NmCbpYeD97FhExL6VC6s6\nSUtHS2++ed7RmJlVTquD4CQ1tHA4IiKXxug82xwgjXU480y4//7cQjAza7O2tjmUNUJa0gbAxhFx\nt6S+QK+ImNvuKDsg7+Tw4YdptPQLL8Dqq+cWhplZm3T6CGlJ3wf+CvwuO7QecFP7wqt9yy8Pu+8O\nt9+edyRmZpVTzgjpHwLDgLkAEfEsMKCSQVU7T8RnZvWunOTwUWGXVUm9gG7dmXPvvWHChNRzycys\nHpWTHO6VdArQV9JwUhVTt55laMCA1Fvp3m4zPtzMuptyksOJwBvAE8ARwO3Af7d2UTZR3yRJUyRN\nl3RWdvw8SU9JmirpRkn9Cq4ZLek5SU9LGtG+v1LX8ER8ZlbPyunKekxE/Kq1Y8u4tm9EfJBVRT0A\nHA+sAEyMiCWSzgaIiJMkDQGuJs3ZtC5wN7BpRCxpds9ceys1eeKJ1Pbwwgtp/IOZWTWrxHoOh7Vw\n7Nvl3DwiPsg2+wA9gbcjYkLBB/4kUu8ngP2AayJiYUTMBGYAQ8t5Th623DJNozFtWt6RmJl1vmWO\nkJZ0CPB1YLCkwgqUlYG3yrl5Nu33Y8BGwG8iYnqzUw4Hrsm21wEeKnhvFqkEUZWkpb2Wttwy72jM\nzDpXqekzHgReA9YgTdHdVByZCzxezs2zEsK2WbvCnZIaIqIRIGvkXhARV5e6RUsHx44d+/F2Q0MD\nDQ0N5YTT6UaNgtNOg5NPzuXxZmbL1NjYSGNjY7uvL6fN4dyIOKHZsXMi4sQ2PUg6FZgfEedLOgz4\nHvDFiPgwe/8kgIg4O9sfD4yJiEnN7lMVbQ6QurIOGADPPpv+NDOrVpVocxjewrG9yghkDUn9s+0V\nsvtMljQS+CmwX1NiyNwMHCypj6TBwCbAw2XEl5s+fWD4cLjttrwjMTPrXKXaHH4AHAlsJOmJgrdW\nBv5Rxr0HAldk7Q49gKsiYqKk50gN1BOUuvn8MyKOjIjpkv4CTAcWAUdWTRGhhFGjYNw4+HZZTfRm\nZrWh1Epw/YBVgbNJYx2aiiPzIqKsBulKqKZqJYA330zLh86Zk+ZdMjOrRp25Ety7ETEzIg4G1gd2\ny7qY9siqfQxYYw3YemvoQLuPmVnVKWdW1rHACcDo7FAf4M8VjKnmNC0famZWL8ppkP4yaYDa+wAR\nMRtYqZJB1ZpRo+DWW722tJnVj3JnZf14CgtJK1Ywnpq02Wap59LUqXlHYmbWOcpJDn+V9Dugf7bw\nz0Tg0sqGVVskT8RnZvWl3GVCRwBNs6TeGRETKhpV6ViqqrdSk7//HU46CR6u6pEZZtZdVWoN6YGk\nSfACeDgi/t3+EDumWpPDwoVpbelp02DgwLyjMTP7pEqsIf1d0uypBwAHApMkfaf9Idan3r1hzz1T\nw7SZWa0rZ26lZ4Gdmwa+SVqdNKp50y6Ir6V4qrLkAHD11XDtte7WambVpxJzK70JvFew/152zJr5\n0pfSYLj58/OOxMysY0rNrfSTbHMGqSppXLa/H2VO2d3drLoqbL89TJwI++yTdzRmZu1XquSwMmmw\n2/PAOFJjdAB/A16ofGi1yV1azawelNVbqZpUc5sDpLUddtsNZs3y2tJmVj0q0eZgbbDpprDyyvDY\nY3
lHYmbWfk4OFeCJ+Mys1jk5VIDbHcys1pUzCO48SatI6i1poqQ3JX2rK4KrVbvsAi+9lNodzMxq\nUTklhxERMRfYB5gJbERaA9qWoVevNObBo6XNrFaVkxyaxkLsA1wfEe+SurRaCa5aMrNaVk5yuEXS\n08AOwERJA4APKxtW7Rs5Eu6/H95/P+9IzMzartXkEBEnAbsCO0TEAtKKcPtVOrBa168fDB0KE3Kb\n3NzMrP1KTZ/xxYiYKOlAsmok6eNhXQHc2AXx1bSmqqX99887EjOztllmcgA+T1r1bRQttzE4ObRi\n1Cg46yxYsgR6uNOwmdUQT59RYVtsAZdfDp/9bN6RmFl35ukzqox7LZlZLXJyqLB993VyMLPaUzI5\nSOohaZeuCqYeffaz8NpracS0mVmtKJkcImIJcEkXxVKXevaEvfZy6cHMaks51Up3S/pKQTfWskha\nXtIkSVMkTZd0Vnb8q5KmSVosaftm14yW9JykpyWNaMvzqpnbHcys1rTaW0nSe0BfYDFLR0ZHRKzS\n6s2lvhHxgaRewAPA8aT1p5cAvwN+EhGPZecOAa4GdgTWBe4GNs1KL4X3rKneSgDz5sG668Ls2Wmt\nBzOzrtbpvZUiYqWI6BERvSNi5ezVamLIrv0g2+wD9ATejoinI+LZFk7fD7gmIhZGxEzS2tVDy/tr\nVLeVV4add4a77so7EjOz8pQzZXcPSd+SdFq2v76ksj60s2unAHOAeyJieonT1wEKJ7meRSpB1AX3\nWjKzWlJqhHSTS0jVQLsD/wO8lx37TGsXZlVC20rqB9wpqSEiGtsQX4v1R2PHjv14u6GhgYaGhjbc\nMh/77AOnnw6LF6dGajOzSmpsbKSxsbHd15fT5jA5IrZr+jM7NjUitmnTg6RTgfkRcX62fw+fbHM4\nCSAizs72xwNjImJSs/vUXJtDk6FD4fjj4WtfyzsSM+tuKjFCeoGkj7/rSlqTVJJoLZA1JPXPtlcA\nhgOTm59WsH0zcLCkPpIGA5sAD5cRX8244AL48Y9h7ty8IzEzK62c5HARcBMwQNKZwD+As8q4biDw\n96zNYRJwSzbL65clvQLsBNwm6Q6ArD3iL8B04A7gyJotIizDsGGw554wZkzekZiZlVbWxHuSNge+\nmO1OjIinKhpV6VhqOme8+WaajG/8eNhuu7yjMbPuoq3VSuW0OfwMuBd4MCJyX9es1pMDwGWXwe9/\nDw8+6Km8zaxrVKLN4QXg68C/JD0s6ReSvHxNB3z726nH0u9/n3ckZmYtK3s9B0lrAweRRjmvGhEr\nVTKwEnHUfMkB4PHHYY894MknYcCAvKMxs3pXiWqly4DNSQPZHgDuByZHxMKOBNpe9ZIcIHVrfeMN\nuOKKvCMxs3pXiWql1UiD5d4B3gbezCsx1JuxY+Gee6AD41TMzCqiLdVKmwMjgWOBnhGxXiUDKxFH\n3ZQcAG66CU45BaZMgT598o7GzOpVJaqVRgGfy179gYeA+yPi8o4E2l71lhwi0pTeu+4Ko0fnHY2Z\n1atKJIdfA/eREsKrHYyvw+otOQC8+CLsuCM88ggMHpx3NGZWjzo9OWQ3XZu0zkIAD0fE6+0PsWPq\nMTkAnHlmGvdwyy3QtmWVzMxa1+kN0pK+Rpr+4qukrqwPS/pq+0O0lhx/PDz/PIwbl3ckZmblVSs9\nDuzRVFrIJt6bGBFbd0F8LcVTlyUHSD2XDj0Upk+HlXIZRWJm9aoSXVkFvFGw/xafnE3VOsluu0FD\nQ1r3wcwsT+WUHM4DtiGt7yxS1dLjEXFC5cNrMZ66LTkAvP46bLklTJwIW22VdzRmVi8q0VtJwAHA\nMFKD9P0RcVOHouyAek8OAL/9LfzpT3DffZ6Yz8w6R0V6K1WT7pAcliyBnXeG738fvvOdvKMxs3rQ\naclB0nssYw1nICJilXbE12HdITkATJ4MI0fCtGmwxhp5R2Nmtc4l
hzpyzDHw3ntp/Qczs47ozJLD\nyhExr5WHtXpOZ+tOyWHuXBgyBK67Lk2vYWbWXp3ZlfUmSb+WNELSagUPWF3SnpJ+Q1pb2ipklVXg\nggvgv/4LFnoeXDPrQiWrlSTtTloFbldgnezwq6R1Hf4cEY2VDrCFmLpNyQHSxHwjR8Lw4WkUtZlZ\ne7jNoQ7NmAE77QSPPQbrr593NGZWiyoxQtpytvHGcNRRqYHazKwrODnUiBNPTOtN33pr3pGYWXfg\naqUaMmFCGhg3bRr07Zt3NGZWSzqtWilrjG7aHtzsvQPaF551xPDhqe3hZz/LOxIzq3elxjlMjojt\nmm+3tN+VunPJAeC112DrreHee9MYCDOzcrhBus4NHAinnQZHHpm6uZqZVYKTQw068kiYNw+uuirv\nSMysXpWqVnoXuJe0hsPngPsL3v5cRPQveWNp+ez65YA+wN8iYnQ22vo64FPATOBrEfFOds1o4HBg\nMXB0RNzVwn27dbVSk0cegVGj0qpxq63W+vlm1r115txKDaUuLGd0tKS+EfGBpF6kUdXHA/sCb0bE\nuZJOBFaNiJMkDSEtKLQjsC5wN7BpRCxpdk8nh8wPfwiLF6f1H8zMSqnYCGlJfYAtgNlN60m3Iai+\npFLEYcANwBciYo6ktYHGiNgsKzUsiYhzsmvGA2Mj4qFm93JyyLzzTmqUvvHG1IvJzGxZOrMr6+8k\nbZlt9wOmAlcCUyR9vcxgekiaAswB7omIacBaETEnO2UOsFa2vQ4wq+DyWaQShC1D//5w3nlpYr5F\ni/KOxszqSa8S730uIo7Itr8NPBMR+2ff9seTqoBKyqqEts2Sy52Sdmv2fkgqVQxo8b2xY8d+vN3Q\n0EBDQ0NrodStr38dLr8cLr4Yjj0272jMrFo0NjbS2NjY7uvLHedwO/DXiPhDtj8lIrZt04OkU4H5\nwHeBhoj4t6SBpBLFZpJOAoiIs7PzxwNjImJSs/u4WqmZp5+GYcNg6lRY12UtM2tBZ45zeFfSKEnb\nA7uQSgtI6g0sX0Yga0jqn22vAAwHJgM3A4dmpx0KjMu2bwYOltQnG5G9CfBwuX+R7myzzVLV0rHH\neuyDmXWOUiWHTwMXAmsDv4yIP2bHRwLDI+InJW8sbQVcQUpAPYCrIuK8rCvrX4D1Ke7KejKpK+si\n4JiIuLOF+7rk0IL582GXXWDVVeH882H77fOOyMyqiddz6MYWLYJLL4XTT0/zMP385zBoUN5RmVk1\n6MxxDheRGoRbullExNHtC7FjnBxaN28enHsuXHIJHHEEnHRSWnLUzLqvzmxz+C/SyOhXgX9lr0cL\nXlalVl4ZzjgjNVC/+ipsumlKFF6H2szKVarksAbwVeBrpOksriP1WHqn68JrMS6XHNpoypS0/vTs\n2XDOOWnaDZX9/cHM6kFF2hwkrQccDPwYODEicpvyzcmhfSJg/PiUJAYMSI3WO+yQd1Rm1lU6fcpu\nSTsAxwDfBO7AVUo1SYIvfSlVNR1ySCo9fOtb8PLLeUdmZtWo1PQZZ0h6FDiONC/SjhHxnYiY3mXR\nWafr1SstNfrMMzB4MGy3HYweDe++m3dkZlZNSrU5LAFeBD5o4e2IiK0rGdiyuFqpc82enRYPuu02\nOPXUlDh69847KjPrbJ3ZlXWDEtdFRLzUttA6h5NDZUydCj/9Kbz0UuoGu+++brQ2qycVHwQnSaRR\nzde1NbjO4ORQORFw550pSay2GvziF/CZz+QdlZl1hs6csnslST+RdImkI7Ppt78MTAO+0RnBWnWR\nYOTI1PX1W99KpYdvfCOVJsyseynVW+lKYCvSOg5fBB4iNU5/PSL27YLYLCc9e8J3vwvPPgubbJLm\naTrxRDdam3UnpdocHm9qdJbUE3gN+FREzO/C+FqKy9VKXezVV1Oj9c03w9FHw49+lBYaMrPa0Znj\nHBY3bUTEYtLyoLkmBsvHOuuk
Cf3uuw9mzICNN4aTT4bX27RYrJnVklLJYWtJ85pewFYF+3O7KkCr\nHpttBn/8I/zrX2n96s02g2OOgVmzWr3UzGrMMpNDRPSMiJULXr0Ktj3HZze2wQZpIr9p06BPH9hm\nG/je91KpwszqQ6vTZ5gty8CBcN55qeF6nXVg553TmtZPPpl3ZGbWUU4O1mGrr54WGHr++VSK2GMP\n2H9/eOSRvCMzs/ZycrBOs8oqqcvriy+mBHHggTBiBDQ2em1rs1rjZUKtYhYsgD/9Cc4+G9ZcE045\nJc0M62k5rFZEpJ/jjz5Ki2jV8s+u15C2qrN4MVx/PZx5JvTokbrBHnBAGmxn1lFvvAFPPZX+/PDD\nT77mz+/YsY8+ShNR9u4Nq64Ke++dXl/8IvTtm/ffvG2cHKxqRcCtt8LPf566wo4enRqwPQustSYi\ndZl+6imYPv2Tfy5eDJtvDmuvDcsvn14rrLB0u5z9ZZ2z3HLpC01Emub+ttvSz/Cjj8KwYUuTxQYb\n5P0v1DonB6t6EfD3v6eSxPPPwwknwOGHp19I694WL4YXXvjkh3/Ta6WVYMiQlAg233zp9lprdX11\nzzvvwF13pWRxxx1pdcV99kmJYued07op1cbJwWrKQw+lksSjj6YBdd/8Jqy7bt5RWaV99BE891xx\nKWDGjPRhX/jh3/RaddW8o27Z4sWpZ15TqeLll2HPPVOiGDky9earBk4OVpOmToULL4Rx42CLLeDg\ng+ErX0nfyKzyFixIs+8uXgyLFqU/C18tHWvLuYsWwb//vTQRvPJKqoppngQ+/WlYccW8/zU6ZvZs\nuP32lCjuuQe23jolin32gS23zK9R28nBatpHH6Xi+nXXpV+uz3wmJYoDDkhrTFSTJUtSUhs/Pq2D\nMXdu6pk1YkTekbXN7benUtvChamOvWfPVC3Ss+cnXx051rNnSvRNiWDjjdPo+nr34YepK/dtt6XX\nokVLq5923z21a3QVJwerG/Pnpw+ua69NCWPYMDjoINhvP+jXL5+YXn8dJkxICeGuu1Ice+6ZXgsX\npoWSttwSLrgANtwwnxjL9fzzcNxx6dv8r34Fe+2Vd0T1LSL9Wzclisceg89/fmmj9vrrV/b5Tg5W\nl+bNg1tuSSWKxsb0reugg2DUqMpWQyxcCP/859LSwYwZsNtuSxNC8wTw4Yfwy1/C+efDD36QemRV\nWzXJBx+kEs6vfw3HHw8//nEqMVjX+s9/0heMW29NP1+33QZDh1bueU4OVvfeeSe1TVx7bfrgHjky\nVT196Uud0+PpxRdTIhg/PiWijTZamgx23rm86pBZs9Jo8fvuS/NPHXRQ/gOoItK/23HHwWc/mxLY\noEH5xmTJ4sXp56NHBeesqJrkIGkQaTW5AUAA/xcRF0raBvgtsCIwE/hGRMzLrhkNHE5aS+LoiLir\nhfs6OdjH3nwTbrghlSgmT071uQcfDMOHl1+n/f77KQk0lQ7efTe1G+y5Z7rPWmu1P74HHoCjjkqj\nay+8ELbdtv336ohnnkkLNc2aBRddlEpe1r20NTkQERV5AWsD22bbKwHPAJsDjwCfy45/G/ifbHsI\nMAXoDWwAzAB6tHDfMGvJq69GXHhhxK67Rqy2WsR3vhNx110RCxd+8rwlSyKmTo0455yI3XePWHHF\niC98IeLMMyMeeyxi8eLOjWvRoojf/jZiwICIH/wg4s03O/f+pcydG3HCCRGrrx5xwQURCxZ03bOt\numSfnWV/hndZtZKkccDFwPUR0T87NggYHxFbZKWGJRFxTvbeeGBsRDzU7D7RVTFb7Xr5ZfjrX1PV\n08svp0kAd9wR7r031fOusMLSqqLddkuTBlba22/DmDGplDNmDBxxROUGS0Wk5/z0p6mUcM45aQSx\ndV9VU630iYdIGwD3AlsC44FzI+Jvkn5MSgCrSLoIeCgi/pxdcylwR0Tc0OxeTg7WJs8/nz4op0
xJ\nvUNGjkxdKfPyxBOpiuett1JVU0ND59//qKNS9djFF8Ouu3bu/a02tTU5VHyQt6SVgOuBYyJinqTD\ngQslnQrcDCwocbmzgHXYRhulyf6qxVZbpelDbrgBDj0UdtopNVp3tCvjO+/A2LFw9dXpzyOO8OSG\n1n4VTQ6SegM3AH+KiHEAEfEMsGf2/qbA3tnps4HCvhPrZceKjB079uPthoYGGjr7q5dZhUlpBPhe\ne8G558J228Gxx6aupW0dGLVkCVx5Zeo2O2pUWr51zTUrE7fVjsbGRhobG9t9fSV7Kwm4AngrIo4r\nOL5mRLwhqQfwR+DvEfFHSUOAq4GhwLrA3cDGzeuQXK1k9WjmzJQYHn00DaDbf//yur4+9hj86Eep\nK+TFF6d2FbOWVE2bg6RhwH3A4yytHjoZ2AT4YbZ/Q0ScXHDNyaSurItI1VB3tnBfJwerWxMnpqks\nBg5Mo5aHDGn5vLfegv/+b7jppjS77WGHVbaPvNW+qkkOleLkYPVu4UL4zW/gjDPSLLVjxkD//um9\nxYvh0kvhtNPSwLrTT6/e2Uqtujg5mNWJN95IS6vecgv87Gdpwrqjj04rkF10EWyzTd4RWi1xcjCr\nM48+mrqmzpyZprw45JD8p+Kw2uPkYFaHmn7knRSsvapunIOZdZyTgnU1928wM7MiTg5mZlbEycHM\nzIo4OZiZWREnBzMzK+LkYGZmRZwczMysiJODmZkVcXIwM7MiTg5mZlbEycHMzIo4OZiZWREnBzMz\nK+LkYGZmRZwczMysiJODmZkVcXIwM7MiTg5mZlbEycHMzIo4OZiZWREnBzMzK+LkYGZmRZwczMys\niJODmZkVqVhykDRI0j2Spkl6UtLR2fGhkh6WNFnSI5J2LLhmtKTnJD0taUSlYjMzs9IqWXJYCBwX\nEVsAOwE/lLQ5cC5wakRsB5yW7SNpCHAQMAQYCVwiqe5KNo2NjXmH0G61HDs4/rw5/tpSsQ/fiPh3\nREzJtt8DngLWBV4D+mWn9QdmZ9v7AddExMKImAnMAIZWKr681PIPWC3HDo4/b46/tvTqiodI2gDY\nDngIeA54QNL5pOS0c3baOtn7TWaRkomZmXWxilfbSFoJuB44JitBXAYcHRHrA8cBl5e4PCodn5mZ\nFVNE5T5/JfUGbgXuiIj/zY7NjYhVsm0B70REP0knAUTE2dl744ExETGp2T2dMMzM2iEiVO65FatW\nyj74LwOmNyWGzAxJX4iIe4HdgWez4zcDV0u6gFSdtAnwcPP7tuUvZ2Zm7VPJNoddgW8Cj0uanB07\nGfg+8GsfsZPGAAAGsElEQVRJywHzs30iYrqkvwDTgUXAkVHJYo2ZmS1TRauVzMysNtXMOAJJI7PB\ncc9JOjHveNpiWQMCa42kntngxVvyjqWtJPWXdL2kpyRNl7RT3jG1RTZAdJqkJyRdnZW8q5akyyXN\nkfREwbHVJE2Q9KykuyT1zzPGUpYR/3nZz89USTdK6lfqHnlpKfaC934iaYmk1Vq7T00kB0k9gYtJ\ng+OGAIdkA+pqxbIGBNaaY0jVfrVY3PwVcHtEbA5sTRp3UxOyruDfA7aPiK2AnsDBecZUhj+Qfl8L\nnQRMiIhNgYnZfrVqKf67gC0iYhtSW+noLo+qPC3FjqRBwHDgpXJuUhPJgTQYbkZEzIyIhcC1pEFz\nNWEZAwLXyTeqtpG0HrAXcClQU50Csm94n4uIywEiYlFEvJtzWG0xl/QFo6+kXkBflg4erUoRcT/w\nn2aH9wWuyLavAPbv0qDaoKX4I2JCRCzJdicB63V5YGVYxr89wAXACeXep1aSw7rAKwX7NTtArmBA\n4KTSZ1adXwI/BZa0dmIVGgy8IekPkh6T9HtJffMOqlwR8TbwC+Bl4FVS9++7842qXdaKiDnZ9hxg\nrTyD6aDDgdvzDqJckvYDZkXE4+VeUyvJoRarMYq0MCCwJk
jaB3g9IiZTY6WGTC9ge+CSiNgeeJ/q\nrtL4BEkbAccCG5BKnCtJ+kauQXVQ1hOxJn+vJZ0CLIiIq/OOpRzZF6GTgTGFh1u7rlaSw2xgUMH+\nIFLpoWZkAwJvAP4UEePyjqeNdgH2lfQicA2wu6Qrc46pLWaRvjU9ku1fT0oWteIzwIMR8VZELAJu\nJP2f1Jo5ktYGkDQQeD3neNpM0mGk6tVaSs4bkb5YTM1+h9cDHpU0oNRFtZIc/gVsImkDSX1Is7fe\nnHNMZSsxILAmRMTJETEoIgaTGkL/HhH/L++4yhUR/wZekbRpdmgPYFqOIbXV08BOklbIfpb2IHUM\nqDU3A4dm24cCNfUlSdJIUtXqfhHxYd7xlCsinoiItSJicPY7PIvUuaFkcq6J5JB9W/oRcCfpl+K6\niKiZ3iYsHRC4W9YVdHL2g1ararE64Cjgz5KmknornZlzPGWLiKnAlaQvSU11xv+XX0Stk3QN8CDw\naUmvSPo2cDYwXNKzpNkRzs4zxlJaiP9w4CJgJWBC9jt8Sa5BLkNB7JsW/NsXKuv314PgzMysSE2U\nHMzMrGs5OZiZWREnBzMzK+LkYGZmRZwczMysiJODmZkVcXIwM7MiTg5mbSCpj6S7s0FQX23H9fvV\n6HTt1s1UcplQs3q0PWneuO3aef2XgVtow3oSknplswSYdRmXHKwuZPNuPZ1Ny/2MpD9LGiHpH9nK\nYztmrwezabv/0TTXkqTjJF2WbW+Vrba2fAvPGABcBeyYlRw2lLSDpEZJ/5I0vmBiue9JeljSlGwF\nuhUk7QKMAs7LYtgwu3aH7Jo1sonRkHSYpJslTSRN19A3W+FrUnbtvtl5W2THJmcrlG3cBf/c1h1E\nhF9+1fyLNOvkQmAL0nTE/wIuy97bF7iJNC9Oz+zYHsD12baAe0nf6h8Bdi7xnC8At2TbvUlz2Kye\n7R9U8MzVCq45A/hRtv0H4ICC9+4hTYIGsAbwYrZ9GGkNk/7Z/pnAN7Lt/sAzpEV/LgS+nh3vBSyf\n9/+FX/XxcrWS1ZMXI2IagKRpQNOCOE+Skkd/4Krs23WQPtyJiMimYn4C+E1E/LPEMwrnwf80KRnd\nnSZLpSdpMR6ArST9DOhHSkrjl3GPUiZExDvZ9ghglKTjs/3lgPWBfwKnZCv13RgRM8q8t1lJTg5W\nTz4q2F4CLCjY7kX6Bj8xIr4s6VNAY8H5mwLzaNsKgwKmRURLayv8Edg3Ip6QdCjQUPBe4WyXi1ha\nvdu8Kuv9ZvsHRMRzzY49LekhYB/gdklHRMQ9bfg7mLXIbQ7WXQhYhaXf7D+exjhbY/pXwOeA1SUd\nWOY9nwHWlLRTdp/ekoZk760E/Dtb5OmbLE0I87I4mswkLeYD8JUSz7oTOLog5u2yPwdHxIsRcRHw\nN2CrMmM3K8nJwepJ8/nnC/eXAOcBZ0l6jFQF1PT+BcDFWZXMd4CzJa1R4hkBEBELSB/o50iaAkwG\nds7OO5W0TvgDfLJn0rXATyU9KmkwcD7wgyym1Qtiar6M5hlAb0mPS3oSOD07/jVJT0qaTKriqqUV\n+qyKeT0HMzMr4pKDmZkVcYO0WQuy3kvHNDv8QEQclUM4Zl3O1UpmZlbE1UpmZlbEycHMzIo4OZiZ\nWREnBzMzK+LkYGZmRf4/NxTupiNNzpgAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1860e128>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# plot max_features (x-axis) versus RMSE (y-axis)\n",
    "plt.plot(feature_range, RMSE_scores)\n",
    "plt.xlabel('max_features')\n",
    "plt.ylabel('RMSE (lower is better)')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(288.41877774269841, 8)"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the best RMSE and the corresponding max_features\n",
    "sorted(zip(RMSE_scores, feature_range))[0]"
   ]
  },
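  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sorting the `(score, parameter)` pairs and taking the first element works because tuples compare element by element, so the pair with the lowest RMSE comes first. A small sketch with hypothetical scores (the names `toy_scores` and `toy_range` are made up for the example) suggests that `min` gives the same answer without sorting the whole list:\n",
    "\n",
    "```python\n",
    "# hypothetical RMSE values for max_features = 1..4 (not results from this notebook)\n",
    "toy_scores = [3.2, 2.9, 2.7, 3.0]\n",
    "toy_range = range(1, 5)\n",
    "\n",
    "# tuples compare on the first element first, so the lowest score wins\n",
    "best_sorted = sorted(zip(toy_scores, toy_range))[0]\n",
    "best_min = min(zip(toy_scores, toy_range))\n",
    "print(best_sorted)  # (2.7, 3)\n",
    "print(best_min)     # (2.7, 3)\n",
    "```\n",
    "\n",
    "If two settings tie on score, both approaches fall back to comparing the parameter value and return the smaller one."
   ]
  },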
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fitting a Random Forest with the best parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,\n",
       "           max_features=8, max_leaf_nodes=None, min_samples_leaf=1,\n",
       "           min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
       "           n_estimators=150, n_jobs=1, oob_score=True, random_state=1,\n",
       "           verbose=0, warm_start=False)"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# max_features=8 is best and n_estimators=150 is sufficiently large\n",
    "rfreg = RandomForestRegressor(n_estimators=150, max_features=8, oob_score=True, random_state=1)\n",
    "rfreg.fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>feature</th>\n",
       "      <th>importance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>League</td>\n",
       "      <td>0.003402</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>NewLeague</td>\n",
       "      <td>0.003960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Division</td>\n",
       "      <td>0.007253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Assists</td>\n",
       "      <td>0.024857</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Errors</td>\n",
       "      <td>0.026147</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>HmRun</td>\n",
       "      <td>0.041620</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>PutOuts</td>\n",
       "      <td>0.058637</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Runs</td>\n",
       "      <td>0.070350</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AtBat</td>\n",
       "      <td>0.096424</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>RBI</td>\n",
       "      <td>0.133953</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Hits</td>\n",
       "      <td>0.143183</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Walks</td>\n",
       "      <td>0.145255</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Years</td>\n",
       "      <td>0.244960</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      feature  importance\n",
       "7      League    0.003402\n",
       "12  NewLeague    0.003960\n",
       "8    Division    0.007253\n",
       "10    Assists    0.024857\n",
       "11     Errors    0.026147\n",
       "2       HmRun    0.041620\n",
       "9     PutOuts    0.058637\n",
       "3        Runs    0.070350\n",
       "0       AtBat    0.096424\n",
       "4         RBI    0.133953\n",
       "1        Hits    0.143183\n",
       "5       Walks    0.145255\n",
       "6       Years    0.244960"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# compute feature importances\n",
    "pd.DataFrame({'feature':feature_cols, 'importance':rfreg.feature_importances_}).sort_values('importance')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.53646364056364049"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# compute the out-of-bag R-squared score\n",
    "rfreg.oob_score_"
   ]
  },
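  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For a regressor, `oob_score_` is an R-squared value: the proportion of variance in the response explained by the out-of-bag predictions, where 1.0 is a perfect fit and 0.0 is no better than always predicting the mean. The sketch below computes R-squared by hand in plain Python; the helper name `r_squared` and the toy values are assumptions for illustration.\n",
    "\n",
    "```python\n",
    "def r_squared(y_true, y_pred):\n",
    "    # hypothetical helper: R^2 = 1 - (residual sum of squares / total sum of squares)\n",
    "    mean_y = sum(y_true) / float(len(y_true))\n",
    "    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))\n",
    "    ss_tot = sum((t - mean_y) ** 2 for t in y_true)\n",
    "    return 1 - ss_res / ss_tot\n",
    "\n",
    "# toy values, not data from this notebook\n",
    "y_true = [1.0, 2.0, 3.0, 4.0]\n",
    "print(r_squared(y_true, y_true))                # 1.0 for perfect predictions\n",
    "print(r_squared(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0 when always predicting the mean\n",
    "```\n",
    "\n",
    "Read this way, the score reported above means the out-of-bag predictions explain roughly 54% of the variance in the response."
   ]
  },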
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Reducing X to its most important features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(263, 13)"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# check the shape of X\n",
    "X.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(263L, 4L)\n",
      "(263L, 5L)\n",
      "(263L, 7L)\n"
     ]
    }
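   ],
   "source": [
    "# set a threshold for which features to include\n",
    "# (SelectFromModel stands in for the estimator's transform method, which newer scikit-learn versions removed)\n",
    "from sklearn.feature_selection import SelectFromModel\n",
    "print(SelectFromModel(rfreg, threshold=0.1, prefit=True).transform(X).shape)\n",
    "print(SelectFromModel(rfreg, threshold='mean', prefit=True).transform(X).shape)\n",
    "print(SelectFromModel(rfreg, threshold='median', prefit=True).transform(X).shape)"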
   ]
  },
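  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three shapes printed above follow directly from the importance table: a feature is kept when its importance is at least the threshold, and `'mean'` and `'median'` are computed over all 13 importances. The sketch below reproduces the counts in plain Python using the values displayed earlier; the helper name `n_selected` is an assumption for illustration, and the greater-than-or-equal comparison is assumed to match scikit-learn's rule.\n",
    "\n",
    "```python\n",
    "# feature importances copied from the table above\n",
    "importances = {\n",
    "    'League': 0.003402, 'NewLeague': 0.003960, 'Division': 0.007253,\n",
    "    'Assists': 0.024857, 'Errors': 0.026147, 'HmRun': 0.041620,\n",
    "    'PutOuts': 0.058637, 'Runs': 0.070350, 'AtBat': 0.096424,\n",
    "    'RBI': 0.133953, 'Hits': 0.143183, 'Walks': 0.145255, 'Years': 0.244960,\n",
    "}\n",
    "\n",
    "def n_selected(threshold):\n",
    "    # hypothetical helper: count the features whose importance meets the threshold\n",
    "    return sum(1 for value in importances.values() if value >= threshold)\n",
    "\n",
    "values = sorted(importances.values())\n",
    "mean_importance = sum(values) / len(values)\n",
    "median_importance = values[len(values) // 2]  # middle of an odd-length list\n",
    "\n",
    "print(n_selected(0.1))                # 4\n",
    "print(n_selected(mean_importance))    # 5\n",
    "print(n_selected(median_importance))  # 7\n",
    "```\n",
    "\n",
    "With the `'mean'` threshold, the surviving features are AtBat, RBI, Hits, Walks and Years."
   ]
  },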
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# create a new feature matrix that only includes important features\n",
    "from sklearn.feature_selection import SelectFromModel\n",
    "X_important = SelectFromModel(rfreg, threshold='mean', prefit=True).transform(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "284.82790842153145"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# check the RMSE for a Random Forest that only includes important features\n",
    "rfreg = RandomForestRegressor(n_estimators=150, max_features=3, random_state=1)\n",
    "scores = cross_val_score(rfreg, X_important, y, cv=10, scoring='neg_mean_squared_error')\n",
    "np.mean(np.sqrt(-scores))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparing Random Forests with decision trees\n",
    "\n",
    "**Advantages of Random Forests:**\n",
    "\n",
    "- Performance is competitive with the best supervised learning methods\n",
    "- Provides a more reliable estimate of feature importance\n",
    "- Allows you to estimate out-of-sample error without using train/test split or cross-validation\n",
    "\n",
    "**Disadvantages of Random Forests:**\n",
    "\n",
    "- Less interpretable\n",
    "- Slower to train\n",
    "- Slower to predict"
   ]
  },
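  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The out-of-sample estimate mentioned above is possible because each tree is fit on a bootstrap sample, which leaves out roughly a third of the observations; those out-of-bag rows act as a built-in validation set for that tree. A quick simulation in plain Python illustrates the fraction. The sample size reuses the 263 rows of `X`, while the seed and the number of repetitions are arbitrary choices.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "rng = random.Random(1)\n",
    "n_rows = 263       # number of observations in X\n",
    "n_trees = 200      # arbitrary number of simulated bootstrap samples\n",
    "\n",
    "oob_fractions = []\n",
    "for _ in range(n_trees):\n",
    "    # a bootstrap sample draws n_rows row indices with replacement\n",
    "    in_bag = set(rng.randrange(n_rows) for _ in range(n_rows))\n",
    "    oob_fractions.append(1 - len(in_bag) / float(n_rows))\n",
    "\n",
    "mean_oob = sum(oob_fractions) / len(oob_fractions)\n",
    "print(round(mean_oob, 2))  # close to (1 - 1/263) ** 263, about 0.37\n",
    "```\n",
    "\n",
    "Averaging each row's predictions over only the trees that did not see it yields the out-of-bag error, with no separate hold-out set needed."
   ]
  },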
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Machine learning flowchart](images/driver_ensembling.png)\n",
    "\n",
    "*Machine learning flowchart created by the [second place finisher](http://blog.kaggle.com/2015/04/20/axa-winners-interview-learning-telematic-fingerprints-from-gps-data/) of Kaggle's [Driver Telematics competition](https://www.kaggle.com/c/axa-driver-telematics-analysis)*"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
